INTRODUCTION


Compared  to  European-American  (EA)  men,  African-American  (AA)  men  have  a  2-fold 
greater  risk  of  dying  from  metastatic  prostate  cancer  (1-2).  For  both  groups,  proper 
categorization  of  prostate  cancer  biopsies  as  high  or  low-risk  for  metastasis  at  the  time 
of  diagnosis  would  optimize  treatment,  improving  outcomes  and  minimizing  toxicity.  The 
Ostrer  laboratory  has  demonstrated  that  the  specific  genes  within  metastatic  prostate 
cancers  have  been  altered  by  amplification  (increase  in  the  copy  number)  or  deletion 
(decrease  in  the  copy  number)  (3).  These  genes  appeared  to  have  been  selected  by  the 
advantages  that  they  conveyed  to  tumors,  such  as  escape  from  cell  death  (‘anoikis’). 
These  amplified  or  deleted  metastasis  genes  are  enriched  2.5-fold  in  the  primary 
prostate  cancers  of  AA  men  -  a  degree  of  enrichment  that  is  similar  to  the  enhanced 
likelihood  of  metastasis.  The  current  study  is  designed  to  confirm  these  observations 
about  gene  patterns  predictive  of  metastatic  potential  in  new  cohorts  of  men  for  whom 
outcome data are available. The current study will also provide DNA sequence of the
exomes  (expressed  part  of  the  genomes)  in  a  subset  of  these  tumors  and  a  risk  model 
that  can  be  used  for  categorizing  newly  diagnosed  prostate  cancers  as  high  or  low-risk 
for  metastasis.  These  methods  will  be  applied  to  prostate  cancer  biopsy  specimens  to 
demonstrate  that  they  could  be  used  at  the  time  of  diagnosis  for  prediction  of  outcome. 
This study will be beneficial to all men with prostate cancer, because it will provide a
diagnostic tool that could be used for selection of therapy. It should be especially beneficial for
African-American men, who have a greater likelihood of disease and metastasis, and
could provide a precise answer to the challenging problem of this health disparity.

KEYWORDS 

Prostate  cancer,  metastasis,  African-American  men,  health  disparity,  genomics,  copy 
number  alteration,  predictive  signature. 
OVERALL PROJECT SUMMARY


Year  1:  The  main  efforts  during  the  first  year  of  the  project  were  review  of  the  clinical 
data  for  the  subjects  in  the  study  to  verify  their  inclusion,  selection  of  formalin-fixed 
paraffin-embedded  (FFPE)  tissue  blocks,  and  macrodissection  of  tumor  or  normal  tissue 
for  genetic  analysis.  Notably,  IRB  approval  was  secured  from  Duke  and  authorized  by 
the  U.S.  Army  Medical  Research  and  Materiel  Command  Human  Research  Protection 
Office.  Cases  were  identified  among  men  who  received  radical  prostatectomy  for 
prostate cancer and who had accurate long-term follow-up information. Those who had
distant metastases were frequency matched to men cured by radical
prostatectomy by age (within 5 years), race (EA vs. AA men), pathological stage (exact
match),  margin  status  (exact  match),  grade  (Gleason  score,  exact  match),  surgery  year 
(within  3  years),  PSA  (<10,  10-20,  and  >20)  and  location  (North  Carolina  versus  New 
York).  There  is  currently  no  accepted  definition  of  “cured”  after  surgery,  since  late 
recurrences  occur  and  the  cure  by  surgery  subgroup  will  undoubtedly  be  confounded  by 
high-risk  primary  tumors.  For  this  study,  we  have  used  PSA  <0.2  ng/ml  five  years  after 
surgery  as  a  surrogate  marker  for  “cured”  because:  (a)  PSA  recurrences  (PSA  >0.2 
ng/ml) are uncommon after 5 years and (b) even when PSA recurrences do occur after
5  years,  they  are  rarely  fatal  (4). 

Among  the  identified  cases,  the  Pathology  Departments  at  Duke  and  Einstein  reviewed 
the  pre-existing  H&E  slides  for  evidence  of  cancer.  The  pathologists  selected  the  two 
blocks  with  the  highest  tumor  content  and  one  that  was  tumor  free.  We  retrieved  the 
corresponding FFPE tissue blocks and cut 12 sections from each block, each 5 microns thick. These
sections were placed in 2 ml Eppendorf tubes, bar-coded with a unique de-identified
code for each patient, and assembled for genomic analysis.

The biomarkers are copy number alterations (CNAs) detected by molecular inversion
probe (MIP) technology using the Affymetrix Oncoscan v2 SNP array developed
specifically  for  genomic  DNA  samples  extracted  from  FFPE  tissues.  This  array  has 
been  applied  to  more  than  5000  samples  with  an  average  pass  rate  of  92%.  Among  the 
features  of  the  method  are  a  wide  dynamic  range  (0-60  copies)  and  interrogation  of  the 
entire genome by analysis of more than 335,000 markers. To assess the validity of our
metastasis signature and metastatic potential score (MPS) prediction model, we tested a Duke cohort that was
made up of a group of primary tumors that metastasized following radical prostatectomy
(mPTs, n=12), a group of high-risk tumors that did not develop distant metastases
(hiPTs, n=8), and a group of low-risk tumors that did not develop distant metastases
(iPTs, n=7). The high-risk designation of the hiPT group was assigned based on
whether the patient experienced biochemical recurrence and received adjuvant
radiation and/or hormone therapy after surgery, whereas the iPTs represent tumors of
men who were considered low-risk and did not receive adjuvant therapy.

The MPS score was calculated for the Duke cohort (Figure) and shown to distribute as
expected for mPTs, iPTs and hiPTs. Receiver operating characteristic (ROC) analysis
restricted to the Duke cohort mPTs and iPTs yielded an area under the curve (AUC)
of 0.91. When the Duke cohort mPTs and hiPTs/iPTs were pooled with the previously
described surgical validation set, the ROC-AUC was 0.77.


Figure.  Boxplots  of  MPS  score  (Y-axis)  of  primary  tumor  samples  from  the  Duke  cohort 
validation  study  (right  panel)  shown  relative  to  previously  studied  cohorts  (left  and 
middle  panels)  (3).  METS  are  metastases.  mPTs  are  primary  tumors  that  went  on  to 
metastasize.  Ln+PTs  are  tumors  that  spread  to  regional  lymph  nodes.  Control  PTs  are 
primary  tumors  whose  natural  history  is  unknown.  Cell  lines  are  derived  from  tumors  of 
various  origins. 
Thus, every step, from subject identification, tumor assessment, block retrieval,
dissection, and DNA extraction through the Oncoscan v2 array and data analysis, met our expectations
and suggested that we could meet the goals of Specific Aim 1 of our study.

To identify samples that were suitable for whole exome sequencing (Aim 3), we
selected 6 matched tumor-normal pairs that were frozen immediately following
resection. As outcome information is not available for these, we chose to identify those
with high and low metastatic potential scores, based on CNA profiles. To do so, the
samples were dissected, DNA was extracted, Affymetrix v6 arrays were run, and
MPS scores were calculated. Among these, 1 was identified as having a high MPS score
and 5 as having low MPS scores. From these, 1 mPT sample and 1 iPT
sample were selected for whole exome sequencing.
KEY RESEARCH ACCOMPLISHMENTS


In  a  replication  study,  MPS  score  was  shown  to  be  an  accurate  predictor  of  metastatic 
risk  (0.91  ROC-AUC)  using  Oncoscan  v2  arrays. 

CONCLUSION 

The  salient  feature  of  this  study  involves  translation  of  the  basic  research  of  tumor 
biology into a risk model that can inform clinical decisions for men with
prostate  cancer  and  their  physicians.  This  will  determine  whether  prostate  cancers  are 
treated  aggressively,  because  they  are  deemed  to  have  high  metastatic  potential,  or 
whether  they  are  treated  with  active  surveillance,  because  they  are  deemed  indolent. 
These  findings  are  now  being  applied  to  studying  the  health  disparity  of  prostate  cancer 
metastasis. 
The transition of cancer from a localized tumor to a distant metastasis is not well understood
for prostate and many other cancers, partly because of the scarcity of tumor samples, especially
metastases, from cancer patients with long-term clinical follow-up. To overcome this limitation,
we developed a semi-supervised clustering method using the tumor genomic DNA copy number
alterations to classify each patient into inferred clinical outcome groups of metastatic potential. Our
data set comprised 294 primary tumors and 49 metastases from 5 independent cohorts of
prostate cancer patients. The alterations were modeled based on Darwin's evolutionary selection
theory and the genes overlapping these altered genomic regions were used to develop a metastatic
potential score for a prostate cancer primary tumor. The proteins encoded by
some of the predictor genes promote escape from anoikis, a pathway of apoptosis that is deregulated
in metastases. We evaluated the metastatic potential score with other clinical predictors available
at diagnosis using a Cox proportional hazards model and showed that our proposed score was the only
significant predictor of metastasis-free survival. The metastasis gene signature and associated score
could be applied directly to copy number alteration profiles from patient biopsies positive for
prostate cancer.
1. Introduction

Prostate  cancer  is  a  common  public  health  problem.  In  2012,  this  disease  was  expected  to 
be  diagnosed  in  an  estimated  241,740  men  (29%  of  all  male  cancers)  and  to  result  in  28,170 
deaths (9% of male cancer deaths) [1]. If left untreated, around 70% of prostate cancers
remain asymptomatic and indolent for decades [2]. If treated with radical prostatectomy
or  radiation  therapy,  the  risk  of  metastasis  is  reduced,  but  erectile  dysfunction,  urinary 
incontinence,  and  rectal  bleeding  may  occur,  affecting  the  patient's  quality  of  life.  Because 
it  is  currently  difficult  to  determine  accurately  which  patients  will  develop  metastatic 
disease,  physicians  treat  patients  with  mid-to-late  stage  local  disease  aggressively,  even  when 
such treatment may not be required. Clinical parameters, such as serum concentration of
prostate-specific antigen (PSA), extension beyond surgical margins, invasion of seminal
vesicles, extension beyond the capsule, surgical Gleason score, prostate weight, race, and
year of surgery, are employed in existing nomograms for prediction of local recurrences
after surgery [3], but many of these parameters are not available at diagnosis and cannot
be  used  for  guiding  therapeutic  decisions.  Development  of  a  robust  risk  model  from 
a  biopsy  that  accurately  predicts  the  potential  of  a  local  prostate  cancer  to  metastasize 
would  justify  aggressive  treatment  in  high-risk  cases  and  improve  the  quality  of  life 
for  men  with  indolent  disease  by  allowing  them  to  avoid  treatment-related  side  effects. 
Thus,  the  goal  of  this  study  was  to  develop  a  method  to  identify  tumor  genomic 
biomarkers  that  could  be  applied  to  prediction  models  that  help  guide  clinical  treatment 
decisions. 

The  method  chosen  for  developing  the  predictive  model  was  the  analysis  of  genomic 
DNA  copy  number  alterations  (CNAs)  in  prostate  cancers,  because  these  cancers  have 
long  been  known  to  harbor  multiple  genomic  imbalances  that  result  from  CNAs  [4,  5]. 
High-resolution  measurements  of  CNAs  have  functional  value,  in  some  cases  providing 
evidence  for  alterations  in  the  quantity  of  normal,  mutant,  or  hybrid-fusion  transcripts  and 
proteins  in  the  cancer  cells.  The  resulting  changes  in  abundance  or  altered  structure  of  RNA 
transcripts  and  proteins  (e.g.,  truncating  dominant  negative  mutations)  may  impact  the 
fitness  of  the  cell  and  provide  some  of  the  mechanisms  necessary  for  distant  site  migration, 
invasion,  and  growth.  From  the  multiple  CNAs  identified  in  tumors,  CNA-based  gene 
signatures were developed into a score that showed potential for predicting metastasis-free
survival.


2.  Methods 

2.1.  Cohorts  and  Samples 

We studied four publicly available prostate cancer cohorts and a fifth cohort reported here:
(1)  294  primary  tumors  and  matched  normal  tissue  samples  from  NYU  School  of  Medicine 
(NYU  n  =  29),  Baylor  College  of  Medicine  (Baylor  n  =  20)  [6],  Memorial  Sloan-Kettering 
Cancer  Center  (MSK  n  =  181)  [7],  and  Stanford  University  (SU  n  =  64  (single  reference 
used  for  each  tumor))  [8].  (2)  49  metastatic  tumors  and  matched  normal  samples  from  Johns 
Hopkins  School  of  Medicine  (Hopkins  n  =  13)  [9]  and  MSK  (n  =  36)  [7].  The  13  patients  in 
the  Hopkins  cohort  had  multiple  metastases  dissected  at  autopsy,  totaling  55  samples  for  the 
study. We also studied a sixth, publicly available cohort of 337 cell lines originating from
varying  tumor  cell  types  (ArrayExpress  ID:  E-MTAB-38). 
Genomic DNA (gDNA) from the NYU cohort was extracted from fresh-frozen prostate
tumors  using  a  Gentra  DNA  extraction  kit  (Qiagen).  Purified  gDNA  was  hydrated  in  reduced 
TE  buffer  (10  mM  Tris,  0.1  mM  EDTA,  pH  8.0).  The  gDNA  concentration  was  measured  using 
the NanoDrop 2000 spectrophotometer from the optical density (OD) at a wavelength of 260 nm. Protein
and organic contamination were assessed at OD 280 nm and 230 nm, respectively. Samples
that passed OD quality control thresholds were then run on a 1% agarose gel to assess the
integrity of the gDNA. For each sample, 500 ng of gDNA was run on the Affymetrix Human SNP
Array  6.0  at  the  Rockefeller  University  Genomics  Resource  Center  using  standard  operating 
procedures.  Samples  that  were  obtained  from  public  sources  were  processed  according  to  the 
methods  outlined  in  their  respective  publications.  Affymetrix  .cel  files  were  processed  using 
the  Birdseed  v2  algorithm  [10]. 


2.3.  Study  Design 

The  case  samples  in  this  study  were  either  metastatic  tumors  (METS)  or  primary  tumors 
from  men  treated  with  radical  prostatectomy  that  were  clinically  followed  up  and  reported  to 
develop distant metastases (mPTs). METS and mPTs are clearly discernible phenotypes that
can  be  classified  unequivocally  as  cases.  The  control  samples  were  defined  as  primary  tumors 
that  had  not  progressed  to  form  distant  metastases  following  radical  prostatectomy  either 
because  clinical  followup  was  not  available  or  because  the  treatment  rendered  the  patient 
not  informative  for  this  outcome.  Radical  prostatectomy  treats  both  indolent  primary  tumors 
(iPTs)  that  would  not  metastasize  and  primary  tumors  that  would  otherwise  progress  to  form 
metastases,  if  left  untreated.  Thus,  the  control  primary  tumors  actually  represent  a  mixture  of 
iPTs and unrealized mPTs. Assuming a randomly sampled cohort, it is expected that about
30% of the control group of primary tumors would be unrealized mPTs [2]. Considering
the scarcity of clinically informative mPTs and iPTs for study, our strategy for identifying
CNA biomarkers from tumors with inferred metastatic outcomes allowed a greater number
of individual genomes to be used. Accordingly, none of the clinically informative mPTs available
to us were used to identify the biomarkers; they were only tested in a Cox proportional hazards
model to assess the clinical usefulness of these predictors. Future tumor cohort study designs
using the method presented in this paper should consider the prevalence of metastatic
progression to ensure a large enough representation of both mPTs and iPTs. The natural
history of prostate cancer without medical intervention (e.g., under watchful waiting or active
surveillance) is well documented [2]. Assuming a randomly sampled cohort, this information
allowed us to estimate the prevalence of mPTs to be 30%.


2.4.  Cancer  Genomics  Copy  Number  Algorithm 

A  genomic  DNA  copy  number  analysis  pipeline  (Figure  1)  was  designed  using  the 
R-statistical  software  [11]  (R)  to  process  the  raw  intensity  data  through  a  series  of 
computational  steps  resulting  in  ranked  lists  of  genes  and  associated  significance  that  could 
be  used  for  functional  mining  and  prediction  model  development.  The  R-package  will  be 
provided  upon  request  and  raw  and  processed  data  can  be  obtained  from  Gene  Expression 
Omnibus  accession#  GSE27105. 
Cancer genomics pipeline overview
Figure 1: Array CGH analysis pipeline for processing pixel image data from Affymetrix SNP arrays to
produce  genotype  and  signal  intensity  measures  for  copy  number  analysis  used  for  developing  bioclinical 
models  and  diagnostics. 


2.5.  Raw  Data  Processing 

Signal  intensity  files  (.cel)  for  the  Affymetrix  SNP  Array  6.0  or  500  k  mapping  arrays  were 
processed  using  the  Affymetrix  Power  Tools,  Birdseed  V2  [10],  and  BRLMM  [12]  algorithms, 
respectively,  resulting  in  genotype  allele  calls  and  signal  intensity  measures  for  each  SNP  and 
copy  number  probe.  After  the  first  stage,  the  genotype  calls  were  prepared  for  downstream 
principal  component  analysis  for  ethnic  identification  and  quality  control  testing,  especially 
important  when  investigating  racially  driven  health  disparities  (Figure  2).  Men  of  African 
descent  have  an  increased  incidence,  earlier  onset,  and  more  aggressive  form  of  the  disease 
than  those  of  European  origin.  Even  when  adjusted  for  the  increased  level  of  incidence  in 
African  Americans,  the  mortality  rate  of  African  American  men  is  more  than  twice  that 
of Caucasian men [1]. Although not presented in the current work, sophisticated CNA
models  of  metastatic  disease  may  provide  a  biological  explanation  for  the  epidemiological 
observations  of  racial  health  disparity  of  metastasis. 

The  probe-summarized  intensity  signals  were  log  transformed  and  standardized 
(mean  centered,  standard  deviation  scaled)  on  an  individual  array  basis  and  the  relative  copy 
number  was  calculated  by  subtracting  the  normal  from  the  tumor  intensity  for  each  patient 
on  a  probe  basis.  The  resulting  copy  number  profile  (CN)  represented  the  amplification  and 
deletion  events  that  accumulated  in  each  cancer  sample  tested. 
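
The per-array standardization and tumor-minus-normal subtraction described above can be sketched as follows. This is a minimal Python illustration rather than the authors' R pipeline; the function names, the base-2 logarithm, and the use of the sample standard deviation are assumptions, since the text does not specify them.

```python
import math
import statistics


def standardize_log(intensities):
    """Log-transform one array's probe intensities, then mean-center and
    scale by the standard deviation (computed per array)."""
    logs = [math.log2(v) for v in intensities]  # log base is an assumption
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)  # sample SD; the text does not say which SD
    return [(v - mean) / sd for v in logs]


def relative_copy_number(tumor, normal):
    """Probe-wise relative copy number (CN): standardized tumor signal
    minus standardized matched-normal signal."""
    tumor_z = standardize_log(tumor)
    normal_z = standardize_log(normal)
    return [t - n for t, n in zip(tumor_z, normal_z)]
```

In this sketch, probes where the tumor signal is elevated relative to the matched normal give positive CN values and probes where it is reduced give negative values.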

Prostate cancer sample ethnicity validation
Figure 2: Principal component analysis identity testing of a variety of normal SNP profiles from the
germline DNA of prostate cancer patients (PC) used in the study compared to a set of HapMap normal
reference populations from Nigeria (YRI), Europe (CEU), and China (CHB), or of African American (AFA)
descent. The x and y axes represent the 1st and 2nd eigenvectors.

Next, the probes were ordered as they appear in the genome and the copy number
signal data (CN) were smoothed. The smoothing was conducted using a running median
function (runmed in R, with the endrule parameter equal to "median"). The smoothing function
was termed S(CN)_k, where k represents the probe width of the smoothing window. The
values of k usually range from 5 to 151, depending on the array's probe density, and were
chosen not to exceed a biologically meaningful span of total genetic distance. Considerations
for k should include the average alteration size (estimated empirically from each data set)
and the distance between probes as determined by the array probe density. As an extreme
example, smoothing the entire arm of a chromosome will remove all local variation that
exists on that arm. The function S(CN)_k thus yielded n smoothing profiles per sample, with n
representing the number of different values used for k. An example of the multiple values of k
used for chromosome 1 of a particular sample is shown in Figure 3.
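
The multi-width running-median smoothing described above can be illustrated with a short Python sketch. It is a stand-in for R's runmed, not a port of it: the ends are handled here by shrinking the window symmetrically, which only approximates the endrule = "median" behavior, and the function names are hypothetical.

```python
import statistics


def running_median(values, k):
    """Running median with an odd window of k probes.

    Near the ends the window shrinks symmetrically so that it never
    extends past the data (a simplification of R's end rule).
    """
    if k % 2 == 0:
        raise ValueError("k must be odd")
    half = k // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        h = min(half, i, n - 1 - i)  # largest symmetric half-width that fits
        smoothed.append(statistics.median(values[i - h:i + h + 1]))
    return smoothed


def smoothing_profiles(cn, widths):
    """One smoothed profile S(CN)_k per window width k, keyed by k."""
    return {k: running_median(cn, k) for k in widths}
```

With k = 3, for example, an isolated single-probe spike is removed while a sustained step in the signal is preserved, which is the property that makes the median attractive for noisy copy number data.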


2.6.  Copy  Number  Alteration  Calling  Algorithm 

Figure 3: A representative primary tumor chromosome 1 copy number profile (top panel) and
corresponding S(CN)_k [k = {9, 49, 99}] in the bottom panels. Therefore, n = 3 because three different
smoothing lengths are used. Black probes represent probes that are not called, while red probes are the
called events that exceed the amplification and deletion thresholds.

The next part of this stage involved assigning copy number events to each probe. We
developed a CNA caller from scratch because the standard calling algorithms
required parameter inputs that were dependent on the signal-to-noise distribution of the copy
number measures. Because the signal-to-noise ratios of cancer samples are notoriously variable, both on a
chromosome basis (within a sample profile) and across samples, the standard CNA
calling approaches were inefficient without significant reconfiguration. Therefore, we developed
a method that was dynamic to the signal-to-noise variation observed in cancer genomes.
We validated the effectiveness of our approach (Figure 4) using a benchmark simulation data
set used to test a variety of algorithms [13]. Given that SNP arrays are not designed to
provide quantitative measures of copy number (but do respond linearly to CNAs), we restrict
our calls to three categories: amplifications (1), deletions (-1), and neutral events (0). To
determine the "center" of the genome so that thresholds can be drawn, we assume that a
majority of the intensity values reflect a 2-copy state for the referenced sample, that is, the
majority of the referenced tumor sample exists in a 2-copy state (manual calling is used
for those samples in which this assumption is not valid). To accomplish this, we sample
10,000 random stretches of probes covering approximately 500 kilobases from the autosomes,
calculate the median of each, and use the most frequently occurring value to scale the sample
appropriately. Following scaling of the genome, thresholds were drawn based on quantile
values and copy number states were assigned to each probe. Since this thresholding scheme
was applied to every smoothing, there were n event calls per probe. These calls result in a p_k
profile, where T() represents the function of trinary binning:

$$p_k = T\left(S(CN)_k\right) \qquad (2.1)$$
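
The centering and trinary binning steps can be sketched in Python as shown below. Several details are assumptions made for illustration only: stretches are measured in probes rather than kilobases, medians are rounded so that a "most frequent" value exists for continuous data, scaling is done by subtracting the center, and the thresholds are passed in directly because the text does not state which quantiles were used. All function and parameter names are hypothetical.

```python
import random
import statistics
from collections import Counter


def estimate_center(smoothed, stretch_len, n_stretches=10000, decimals=2, seed=0):
    """Most frequent median over randomly placed stretches of probes,
    taken as the signal level of the assumed 2-copy state."""
    rng = random.Random(seed)
    last_start = len(smoothed) - stretch_len
    medians = []
    for _ in range(n_stretches):
        start = rng.randint(0, last_start)  # inclusive on both ends
        window = smoothed[start:start + stretch_len]
        # Rounding is an assumption; it bins continuous medians.
        medians.append(round(statistics.median(window), decimals))
    return Counter(medians).most_common(1)[0][0]


def trinary_calls(smoothed, center, del_threshold, amp_threshold):
    """T(): +1 amplification, -1 deletion, 0 neutral, after centering."""
    calls = []
    for value in smoothed:
        centered = value - center
        if centered > amp_threshold:
            calls.append(1)
        elif centered < del_threshold:
            calls.append(-1)
        else:
            calls.append(0)
    return calls
```

Applying trinary_calls to each smoothed profile S(CN)_k in turn gives one p_k profile per window width.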


Figure 4: Receiver-operating characteristic curves showing the performance of our CNA-calling algorithm
on the simulated data [13]. Each panel represents a different signal-to-noise ratio and the curves represent
varying event widths of the simulated data. The x-axis represents the false positive rate, and the y-axis
represents the true positive rate. Each curve is generated by testing varying thresholds on 100 simulated
chromosomes for the condition specified. The curves are combined using vertical averaging. The dashed
line represents the random model.

The n p_k calls for each probe were then combined by summation, resulting in a composite
profile (p') that ranged from -n (signifying that a deletion was called at every smoothing
for that probe) to +n (signifying that an amplification was called at every smoothing for that
probe):

$$p' = \sum_{i=1}^{n} p_{k_i} \qquad (2.2)$$
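
As a small illustration of the composite profile, the following Python sketch sums the trinary calls from each smoothing width probe by probe. The dictionary layout (window width mapped to a list of calls) and the function name are assumptions made for the example.

```python
def composite_profile(calls_by_k):
    """Sum the trinary calls (+1, 0, -1) from every smoothing width,
    probe by probe, giving p' values between -n and +n."""
    profiles = list(calls_by_k.values())
    return [sum(column) for column in zip(*profiles)]


# Three smoothing widths (n = 3) over four probes.
example_calls = {
    9: [1, 1, 0, -1],
    49: [1, 0, 0, -1],
    99: [1, 0, 0, -1],
}
p_prime = composite_profile(example_calls)
```

A probe called as amplified at every width receives +n, one called at only some widths receives an intermediate value, and the magnitude of p' therefore reflects how consistently an event is detected across scales.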

Figure 5: Copy number profile p" shows an amplification of a region on chromosome X harboring the
androgen receptor (AR) locus. The x-axis represents the ordered chromosome position and the y-axis
represents standardized population frequencies exhibiting amplifications (above 0) and deletions (below
0). The three populations of tumors are represented as red, black, or green lines for mPTs, androgen
ablation treated metastases (METS), and iPTs, respectively.

One p' profile was thus generated per sample, representing a composite of
n smoothings, and this metric was used for the rest of the primary analysis. We
benchmarked our copy number calling method using a published simulation data set
[13] comprised of randomly generated artificial chromosomes. Each chromosome was
generated with an aberration flanking the center probe with Gaussian noise N(0, 0.25^2)
superimposed. All combinations of signal to noise (SN = 4, 3, 2, and 1) and aberration widths
(W = 40, 20, 10, and 5) were produced for a total of 160,000 analysis runs. Receiver-operating
characteristics (ROC) were computed from the benchmark simulation dataset [13],
where ROC is defined as a pair, ROC = (TPR, FPR), with TPR = (the number of probes within
the aberration width that are above a threshold) / (the total number of probes within the
aberration width) and FPR = (the number of probes outside the aberration width that are above a
threshold) / (the total number of probes outside the aberration width). The threshold values
are selected to range continuously over the values of the data points, and since the ROC is
piecewise constant, only changing when a threshold is equal to the value of a data point, we only
need to consider values of the data points in their sorted order. The area under the curve
(AUC) of each ROC curve was used to gauge performance.
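
The ROC construction defined above can be sketched in Python for a single simulated chromosome. This is an illustrative reading of the definition, not the benchmark code: the function names are hypothetical, a probe is counted when its value is at or above the threshold (so that the lowest threshold reaches the point (1, 1)), and the area is computed with the trapezoidal rule.

```python
def roc_points(values, in_aberration):
    """(FPR, TPR) pairs obtained by sweeping the threshold over the
    observed data values, from highest to lowest."""
    n_pos = sum(1 for flag in in_aberration if flag)
    n_neg = len(values) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(values), reverse=True):
        tp = sum(1 for v, flag in zip(values, in_aberration)
                 if flag and v >= threshold)
        fp = sum(1 for v, flag in zip(values, in_aberration)
                 if not flag and v >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

Only the distinct data values need to be tried as thresholds, because the (FPR, TPR) pair cannot change between two consecutive sorted values.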

To  examine  the  frequency  of  amplification  and  deletions  for  subgroups  of  samples  or 
populations  and  evaluate  the  sensitivity  of  our  CNA-calling  method,  we  further  combined 
the  p'  data  to  create  p"  by  summing  across  the  p'  profiles  on  a  probe  basis  across  multiple 
samples. Two values of p" were calculated for each population or subpopulation. The first value
represented the sum of all positive p' values in the population at any probe and was thus
called p"_amp. Likewise, the second value, representing the sum of all negative p' values in the
population at any probe, was called p"_del.

$$p''_{[\mathrm{amp}|\mathrm{del}]} = \sum^{n\ \mathrm{samples}} p'_{[\mathrm{amp}|\mathrm{del}]} \qquad (2.3)$$
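
A minimal Python sketch of the population-level summation follows. It assumes each sample's p' profile is a list of integers aligned on the same probes; the function name is hypothetical.

```python
def population_profiles(p_primes):
    """Probe-wise sums across samples: the positive p' values give the
    amplification profile and the negative p' values give the deletion
    profile."""
    amp = [sum(v for v in column if v > 0) for column in zip(*p_primes)]
    dele = [sum(v for v in column if v < 0) for column in zip(*p_primes)]
    return amp, dele


# Three samples, three probes.
samples = [
    [3, 0, -2],
    [1, -1, -3],
    [0, 2, 0],
]
p_amp, p_del = population_profiles(samples)
```

Keeping the two sums separate means that a probe amplified in some samples and deleted in others shows up in both profiles rather than cancelling out.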


An example of a copy number p" plot (Figure 5) shows a select region on chromosome
X from metastases of men treated with androgen ablation therapy and from the primary tumors
(iPTs and mPTs) of other men who were not treated. Furthermore, differential analysis of the p" values
can be used to identify probes or regions of probes that encompass genes that may contribute to
the phenotype being tested (e.g., iPT versus mPT or response to therapy versus no response
to therapy).

2.7. Semisupervised Clustering Algorithm

Since  sufficient  labels  were  not  available  to  train  a  model  from  primary  tumors  alone,  we 
first created, from a cohort of men who developed distant metastases, a simplified summary
metastasis profile to capture the high-frequency events that are assumed, in part, to correlate
with the outcome. This clustering approach is not unsupervised, class-less clustering because
we  know  some  information  about  one  of  the  components  which  is  the  summary  profile 
from  known  metastasis  samples.  To  reflect  the  frequency  of  events  observed  for  individual 
metastasis  CNA  profiles  in  the  summary  metastasis  profile,  the  average  number  of  p'  events 
calculated  for  the  group  of  metastases  was  used  to  set  a  threshold  for  the  number  of  total 
p'  events  used  to  build  the  summary  metastasis  profile.  The  actual  probes  chosen  for  the 
metastasis  summary  profile  were  based  on  their  ranked  frequency  which  resulted  in  a 
threshold  of  at  least  25%  of  the  samples  exhibiting  the  event.  Although  not  tested  here,  the 
theoretical  specificity  of  the  summary  profile  is  expected  to  decrease  as  the  threshold  for 
minimum  number  of  events  called  decreases,  while  the  sensitivity  of  the  profile  decreases 
as  the  threshold  of  minimum  number  of  events  called  increases.  In  the  case  of  the  MSK 
cohort,  clustering  of  the  36  metastases  p'  profiles  independently  yielded  two  well-separated 
clusters  from  which  we  built  two  metastasis  summary  profiles  to  perform  semisupervised 
clustering with the primary tumors. In contrast, the 13-patient Hopkins cohort, made up of
55 metastases, yielded only one homogeneous cluster and associated summary metastasis
profile.  To  overcome  the  inherent  variability  with  clustering  algorithms,  we  employed  a 
resampling  hierarchical  clustering  method  to  infer  an  initial  grouping  for  the  unclassified 
primary tumors. For each iteration, a subset of the individual p' profiles from the unknown
primary tumors was randomly chosen with replacement and clustered with the summary
copy  number  profile  derived  from  the  metastasis  samples  (one  metastasis  summary  profile 
from  the  Hopkins  cohort  and  two  from  MSK  cohort).  Therefore,  the  semisupervised 
clustering  analysis  presented  here  was  developed  to  classify  prostate  primary  tumors  into 
subgroups  with  different  metastatic  potential  (mPT  and  iPT)  based  on  their  CNA  profiles. 
Distance  was  calculated  using  a  binary  metric,  and  the  samples  were  joined  using  hierarchical 
clustering  (complete-linkage  method).  The  cluster  tree  was  divided  into  two  groups  at  the 
final  join,  and  the  primary  tumor  samples  were  scored  1  if  they  fell  in  the  same  cluster  as 
the  metastasis  profile,  and  0  if  they  were  in  the  other  cluster.  Using  the  results  from  20,000 
resampling  iterations  of  the  clustering,  a  proximity  score  was  generated  for  each  sample, 
representing  the  number  of  times  it  fell  in  a  cluster  with  the  metastasis  profile.  A  sample  with 
a high score was considered to be more metastatic (mPT), while lower scoring tumors were
more indolent (iPT). The scores were distributed throughout the possible range of values
(0 to 1), allowing us to form distinct groups of tumors with significant contrast between
high and low proximity to MSK metastasis signature 1 (Figure 6). The group of
samples with scores closer to the center of the distribution was omitted to sharpen the
contrast between high- and low-scoring samples.
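
The resampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy 0/1 event vectors, and the choice to report the score as the fraction of draws (rather than a raw count) are all assumptions made for the example, and the default of 200 iterations is far smaller than the 20,000 used in the study.

```python
import random


def binary_distance(a, b):
    """Binary (Jaccard-type) distance between two 0/1 event vectors."""
    either = sum(1 for x, y in zip(a, b) if x or y)
    both = sum(1 for x, y in zip(a, b) if x and y)
    return 0.0 if either == 0 else 1.0 - both / either


def two_clusters_complete_linkage(profiles):
    """Merge clusters by complete linkage until two remain, which gives the
    same partition as cutting the full tree at its final join."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > 2:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(binary_distance(profiles[p], profiles[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def proximity_scores(primaries, met_profile, n_iter=200, subset_size=None, seed=0):
    """Fraction of draws in which each primary tumor clustered with the
    metastasis summary profile (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    names = list(primaries)
    k = subset_size or len(names)
    hits = {name: 0 for name in names}
    draws = {name: 0 for name in names}
    for _ in range(n_iter):
        sample = [rng.choice(names) for _ in range(k)]  # with replacement
        profiles = [met_profile] + [primaries[name] for name in sample]
        clusters = two_clusters_complete_linkage(profiles)
        met_cluster = next(c for c in clusters if 0 in c)  # index 0 = metastasis profile
        for pos, name in enumerate(sample, start=1):
            draws[name] += 1
            hits[name] += pos in met_cluster
    return {name: hits[name] / draws[name] for name in names if draws[name]}


# Toy data: A and B resemble the metastasis profile, C and D do not.
met = [1, 1, 1, 1, 0, 0, 0, 0]
primaries = {
    "A": [1, 1, 1, 0, 0, 0, 0, 0],
    "B": [1, 1, 1, 1, 0, 0, 0, 0],
    "C": [0, 0, 0, 0, 1, 1, 1, 1],
    "D": [0, 0, 0, 0, 1, 1, 1, 0],
}
scores = proximity_scores(primaries, met, n_iter=200, seed=1)
```

Because the tree is always cut into exactly two groups, a tumor that resembles the metastasis profile can still land in the other cluster when a draw happens to contain only similar tumors; averaging over many resampled draws is what smooths out this effect.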

Figure 6: Plot of ranked proximity score for MSK signature 2. Proximity represents the number of times a
particular sample clustered with the MSK metastasis profile 2. The samples with higher scores (red points)
are classified as inferred mPTs and the samples with lower scores (blue points) are classified as inferred
iPTs. Primary tumors (hollow points) interspersed with the blue iPT tumors were excluded as iPTs for
MSK signature 2 because they did not consistently classify as iPTs in the proximity analysis using MSK
signature 1.

2.8. Metastasis Genes Inferred through Evolutionary Selection Modelling

Genomic DNA copy number alterations in local and metastatic prostate tumors are typically
numerous, systematic in their genomic placement and varied in size from point mutations
to duplications or deletions of entire chromosomes. Given these observations, geneticists
have  postulated  that  Darwinian  selection  may  operate  on  the  genomic  instability  in  tumors 
[14].  High-resolution  measurements  of  CNAs  in  somatic  tumors  have  informative  value, 
in  some  cases  reflecting  the  direction  in  which  the  biochemistry  of  the  cell  controls  the 
quantity  of  normal,  mutant,  or  hybrid-fusion  transcripts  and  proteins.  During  this  genomic 
transformation,  the  resulting  modified  transcripts  and  proteins  may  impact  the  fitness  of  the 
cell.  Guided  by  these  principles  of  evolutionary  selection,  our  analyses  sought  to  identify  the 
CNA  landscape  that  reflects  selection  mechanisms  of  metastasis.  Genomic  selection  towards 
a  metastatic  cancer  phenotype  can  be  both  positive  and  negative  and  be  observed  in  CNAs 
exhibiting both amplifications and deletions. For example, genes that promote metastasis and are
amplified in metastatic tumors would reflect positive selection, while metastasis suppressor
genes  that  are  deleted  in  metastases  reflect  negative  selection.  The  genes  associated  with 
these  regions,  altered  at  high  frequency  in  metastatic  tumors  and  enriched  in  mPTs  more  so 
than  iPTs,  lead  to  enhanced  metastatic  potential.  We  identified  specific  CNAs  that  selected 
positively  for  metastatic  potential,  exhibiting  amplifications  in  metastases  and  mPTs  and 
deletions  in  iPTs.  CNAs  identified  to  exhibit  negative  selection  for  metastatic  potential  were 
observed  to  be  deleted  in  metastases  and  mPTs  and  amplified  in  the  iPTs.  Therefore,  we 
designed  models  based  on  Darwin's  evolutionary  selection  theory  to  score  positive  and 
negative  selection  based  on  the  mPT  and  iPT  classifications  derived  through  semisupervised 
clustering using the p' data. For each probe on the array, we calculated an enrichment score,
EN(x), which represented the relative number of amplifications versus deletions observed
in each subgroup x (metastasis, mPT, and iPT):

$$EN(x) = \frac{\#\mathrm{Amp} - \#\mathrm{Del}}{\#\mathrm{Samples}} \tag{2.4}$$

Next, we modeled the relative enrichment by contrasting the metastasis and mPT copy
number alterations with those observed in the iPT group:

$$\mathrm{SM} = g\,\bigl[EN(\mathrm{METS}) + q \cdot EN(\mathrm{mPT}) - EN(\mathrm{iPT})\bigr] \tag{2.5}$$

The first two enrichment terms (for metastatic and metastatic-like samples) being
summed were designed to assign a higher score when the METS and mPT samples had
more amplifications than deletions. Greater amplification enrichment in the METS and
mPTs resulted in higher scores. The third term, EN(iPT), was subtracted, so it raised the
score when the iPT samples exhibited the opposite effect (enrichment for deletions over
amplifications). The middle term, EN(mPT), was multiplied by a data-driven coefficient, q,
representing the average contribution of mPT on a probe basis (Figure 7).

Figure 7: Scatterplot of the enrichment scores of the METS versus those of mPTs, normalized by the
enrichment scores of iPTs. Kernel density estimation curves are shown protruding from the x and y axes.
The horizontal and vertical dashed red lines denote the trim points (quantiles 0.99 and 0.01). A linear
regression line, based on the trimmed values, is shown in blue. The value of q is the Pearson correlation
coefficient for the trimmed x and y values.

For  example,  probes  that  were  amplified  in  all  metastases  and  mPTs  but  deleted  in  all 
iPTs  (positive  selection  driving  the  metastasis  cells)  would  yield  the  highest  possible  score. 
Likewise,  probes  that  were  deleted  in  all  metastases  and  mPT  samples,  but  amplified  in 
all iPT samples (negative selection, inhibiting the promotion of the metastasis cells), would
reach the minimum possible score. Therefore, regions of the genome that enhance and inhibit
metastasis  formation  will  be  captured  by  our  evolutionary  selection  model. 
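
As a worked illustration of the probe-level scoring in Equations (2.4) and (2.5), the sketch below computes the enrichment of one probe in each subgroup and combines the three terms. It is only a sketch under stated assumptions: the function names are hypothetical, copy number calls are encoded as +1 (amplified), -1 (deleted), and 0 (no event), and the leading factor of Equation (2.5) is exposed as a `scale` argument defaulting to 1 because its value is not stated in the text.

```python
def enrichment(calls):
    """EN(x) = (#Amp - #Del) / #Samples for one probe in one subgroup.

    `calls` holds one value per sample: +1 amplified, -1 deleted, 0 no event.
    """
    if not calls:
        raise ValueError("subgroup has no samples")
    n_amp = sum(1 for c in calls if c > 0)
    n_del = sum(1 for c in calls if c < 0)
    return (n_amp - n_del) / len(calls)


def selection_score(mets, mpt, ipt, q=1.0, scale=1.0):
    """Probe-level selection score: the METS and mPT enrichments are added
    (the mPT term weighted by q) and the iPT enrichment is subtracted.
    `scale` is an assumed stand-in for the leading factor of Eq. (2.5)."""
    return scale * (enrichment(mets) + q * enrichment(mpt) - enrichment(ipt))


# Probe amplified in every metastasis and mPT, deleted in every iPT:
highest = selection_score([1, 1], [1, 1, 1], [-1, -1])
# Probe deleted in every metastasis and mPT, amplified in every iPT:
lowest = selection_score([-1, -1], [-1], [1, 1])
```

With q = 1 and scale = 1 the two extreme probes score +3 and -3, which matches the description of the highest and lowest possible scores for positively and negatively selected regions.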

Following  this  probe  scoring  method  we  developed  a  Z-score  model  in  order  to 
extend  this  analysis  to  the  gene  level.  We  assign  each  probe  to  a  gene,  provided  it  falls 
within  10,000  bp  up-  or  downstream  of  the  transcription  start  or  stop  site.  The  SM  scores 
for  the  probes  within  a  gene  are  averaged  and  compared  to  the  mean  and  standard  deviation 
of  a  background  distribution,  which  was  calculated  by  sampling  the  top  5th  percentile  of 
amplified  or  deleted  probes  from  all  genes  on  the  array  with  the  same  number  of  probes  as 
the  gene  in  question.  The  result  is  a  Z-score  for  each  gene  in  the  genome  that  is  represented 
on  the  array. 
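
The gene-level step can be sketched as follows. This is a simplified illustration rather than the published procedure: the function names are hypothetical, probes are represented as (position, score) pairs on a single chromosome, and the background is approximated by repeatedly averaging k randomly drawn scores from a caller-supplied pool, where k is the number of probes in the gene. How that pool is assembled (for example, from the top 5th percentile of amplified or deleted probes) is left to the caller.

```python
import random
import statistics


def probes_for_gene(probes, tx_start, tx_end, flank=10_000):
    """Scores of probes within `flank` bp of the transcription start or stop.

    `probes` is a list of (position, sm_score) pairs on the gene's chromosome.
    """
    lo = min(tx_start, tx_end) - flank
    hi = max(tx_start, tx_end) + flank
    return [score for pos, score in probes if lo <= pos <= hi]


def gene_z_score(gene_scores, background_pool, n_draws=1000, seed=0):
    """Z-score of a gene's mean probe score against a sampled background of
    means of the same number of probes (assumed sampling scheme)."""
    k = len(gene_scores)
    if k == 0 or k > len(background_pool):
        raise ValueError("need between 1 and len(background_pool) probe scores")
    rng = random.Random(seed)
    null_means = [statistics.fmean(rng.sample(background_pool, k))
                  for _ in range(n_draws)]
    mu = statistics.fmean(null_means)
    sd = statistics.stdev(null_means)
    if sd == 0:
        raise ValueError("background distribution has no variance")
    return (statistics.fmean(gene_scores) - mu) / sd


# Toy example: a gene spanning 100,000-200,000 bp picks up the two nearby probes.
probes = [(5_000, 1.0), (95_000, 2.0), (150_000, 3.0), (250_000, 4.0)]
in_gene = probes_for_gene(probes, 100_000, 200_000)
```

Matching the background sample size to the number of probes in the gene keeps genes with few probes from appearing more extreme simply because their averages are noisier.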

2.9.  Metastatic  Potential  Score  and  Survival  Analysis 

We  developed  an  algorithm  based  on  genomic  CNAs  to  calculate  a  metastatic  potential 
score  (MPS),  with  a  higher  score  indicating  a  greater  likelihood  of  metastasis.  The  MPS 
score  for  a  new  individual  patient  only  depends  on  the  CNA  profile  of  this  new  patient. 
It  can  be  calculated  without  requirement  for  other  samples,  since  it's  simply  based  on  the 
concordance/discordance  relationship  to  the  CNA  metastasis  gene  signature  previously 
identified  as  selecting  for  the  metastatic  phenotype  through  our  selection  model.  The  MPS 
was  calculated  using  a  weighted  Z  score  from  the  top  set  of  CNAs  overlapping  metastasis 
genes determined by the significance of their selection model Z scores. We used Z > 1.7
as a cutoff point because, for the standard normal distribution, the upper tail beyond 1.7 is about 5%. The
metastatic  potential  score  was  defined  as  the  following: 


$$\mathrm{MPS} = \sum_{i=1}^{n} Z'_i \cdot \mathrm{Dir}_{\mathrm{sig}}(i) \cdot \mathrm{Dir}_{\mathrm{samp}}(i) \tag{2.6}$$

For each tumor profile, logistic adjusted Z scores (Z') from genes (i = 1, ..., n) that match
the direction of the metastasis gene signature (a vector of -1s and +1s representing whether
the gene was deleted or amplified in the signature, resp.) were added, whereas Z' from genes
that mismatch the direction of the signature were subtracted. As the direction component of
the risk model score (Dir) reflects, if the CNAs of the metastasis signature (Dir_sig) and the
unknown sample profile (Dir_samp) are in the same direction, the coefficient will be 1; if they
are in opposing directions, the coefficient will be -1; and if Dir_samp(i) = 0, then the entire
term will not count towards the score. For example, if a gene i that is typically amplified in
metastases and mPTs (Dir_sig(i) = +1) is also amplified in the unknown profile (Dir_samp(i) = +1), that Z
score is added, whereas if gene i in the profile is deleted, as expected in iPTs, the Z score is
subtracted. Neutral genes that are neither amplified nor deleted in the unknown profile are
not scored in this model.
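
The scoring rule can be illustrated with a short Python sketch. The gene names and Z' values below are invented placeholders, and the function name and data layout are assumptions made for the example; only the arithmetic follows the definition above.

```python
def metastatic_potential_score(signature, sample_calls):
    """Sum of Z' * Dir_sig * Dir_samp over the signature genes.

    signature:    gene -> (z_adj, dir_sig), with dir_sig = +1 (amplified in
                  the signature) or -1 (deleted in the signature).
    sample_calls: gene -> dir_samp, with +1 amplified, -1 deleted, 0 neutral;
                  genes missing from the sample are treated as neutral.
    """
    score = 0.0
    for gene, (z_adj, dir_sig) in signature.items():
        dir_samp = sample_calls.get(gene, 0)
        score += z_adj * dir_sig * dir_samp
    return score


# Hypothetical three-gene signature (names and values are placeholders).
signature = {
    "GENE_A": (2.0, +1),  # amplified in the signature
    "GENE_B": (1.5, -1),  # deleted in the signature
    "GENE_C": (2.5, +1),  # amplified in the signature
}

# Concordant with the signature at two genes, neutral at the third.
concordant = metastatic_potential_score(signature, {"GENE_A": +1, "GENE_B": -1, "GENE_C": 0})
# Opposite direction at the same two genes.
discordant = metastatic_potential_score(signature, {"GENE_A": -1, "GENE_B": +1})
```

The concordant profile scores 2.0 + 1.5 = 3.5 and the discordant profile scores -3.5, while the neutral gene contributes nothing in either case. Only the new sample's own calls enter the calculation, which is why no reference samples are needed at scoring time.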

Three metastasis signatures, derived from a combination of five cohorts, were used to
develop the MPS. The first signature was identified using 49 primary tumors of unknown
clinical outcome from NYU (n = 29) and Baylor (n = 20) and a metastasis cohort from
Hopkins (n = 13). The other two signatures were identified using 75% of the MSK cohort
of primary tumors of unknown outcome (n = 126) along with a set of metastatic tumors
(n = 36) from the same MSK cohort. The CNA-based gene signatures from these 2 sets of
cohorts were concatenated to derive the MPS, which we assessed in a Cox proportional
hazards model with samples set aside for testing purposes only. The test cases comprised
bona fide mPTs (primary tumors that later developed distant metastasis), whereas
the  test  controls  were  derived  from  a  random  sample  of  tumors  with  unknown  outcome 
not  used  to  build  the  MPS.  All  presurgery  predictors  (PSA,  clinical  stage,  biopsy  Gleason) 
and  other  demographic  variables  (age  at  diagnosis  and  race)  were  tested  independently  and 
in  combination  with  the  MPS  in  Cox  proportional  hazards  survival  analysis  with  the  time 
variable  represented  by  progression  to  metastasis. 


3.  Results 

3.1.  Prediction  Models 

Our selection models resulted in three hundred and sixty-eight genes (from 3 metastasis
signatures) with a CNA status that was concordant among METS and mPTs and contrasted
with iPTs (Z > 1.7) (Supplemental Table 1, see Supplementary Materials available online at
doi:10.1155/2012/873570). With these genes, we developed the MPS and tested its accuracy
as an independent predictor of metastasis with a subset of primary tumors (n = 52) not
used to develop the signatures (n = 13 mPTs and n = 39 control primary tumors; Table 1).
As a continuous predictor in a Cox proportional hazards model, the MPS was significantly
associated with the endpoint of metastasis-free survival (hazard ratio = 2.88; 95% CI = 1.15-7.2;
P = 0.02) (Table 2).

Patients diagnosed with prostate cancer have several pretreatment variables, such
as clinical stage (combination of digital rectal exam, PSA, and ultrasound/MRI), biopsy
Gleason score, and other demographic measures (e.g., age or race), to guide the decision to
undergo surgery. These variables have marginal clinical utility and, in our cohorts, none
of these clinical variables was statistically significant in univariate or multivariate logistic
regression models. In multivariate Cox regression models (Table 2), only the MPS
reached statistical significance, indicating that the MPS was the only reproducible
predictor of metastasis-free survival.

Notably,  the  clinical  stage  was  specific  when  palpable  tumor  was  detected  (T2  or 
greater); however, it lacked sensitivity, because 47% (9/19) of pathological stage-4 cases
that were evaluated ex vivo were diagnosed as T1C before surgery [7]. Twenty-seven percent
(13 out of 49) of clinical stage T1C tumors that were upstaged following prostatectomy
resulted  in  distant  metastasis  formation.  Therefore,  staging  at  the  time  of  biopsy  can 
seriously  underestimate  the  severity  of  disease.  Similarly,  the  biopsy  Gleason  score  versus 
the  postsurgery  Gleason  score  was  underestimated  in  38%  of  cases  and  overestimated  in  8% 
[7]  (Figure  8). 


3.2.  Metastatic  Potential  Score  Distributions 

Significant differences in the MPS, as measured by the Mann-Whitney test, were observed for
the metastasis (P < 0.001) and mPT (P = 0.001) groups compared to the control primary
tumors (Figure 9). The MPS in the lymph-node-positive primary tumors (derived from the
MSK (n = 9) and Stanford (n = 9) cohorts) did not differ significantly from the control tumor
group (P_MSK = 0.34, P_Stanford = 0.13, P_combined = 0.08), which reflected the marginal ability of
this clinical parameter to predict distant metastasis in previous reports [15].

Table 1: Clinical and histological characteristics of samples used to validate the metastatic potential score
model.

                                    Case           Control
n                                   13             39
Age
  Mean                              59.5           59.1
  Median                            61             58
  Standard deviation                7.1            7.3
  Range                             46-67          46-73
Race
  Asian                             0 (0%)         1 (1.9%)
  Black                             1 (1.9%)       4 (7.7%)
  Unknown                           0 (0%)         2 (3.8%)
  White Non-Hispanic                12 (23.1%)     32 (61.5%)
Clinical stage
  T1C                               4 (7.7%)       23 (44.2%)
  T2                                5 (9.6%)       16 (30.8%)
  T3                                4 (7.7%)       0 (0%)
  T4                                0 (0%)         0 (0%)
Biopsy Gleason score
  5                                 0 (0%)         0 (0%)
  6                                 4 (7.7%)       26 (50%)
  7                                 7 (13.5%)      10 (19.2%)
  8                                 2 (3.8%)       2 (3.8%)
  9                                 0 (0%)         1 (1.9%)
Prediagnosis biopsy PSA (ng/mL)
  Median                            6.9            5.6
  <4                                2 (3.8%)       6 (11.5%)
  4-10                              6 (11.5%)      24 (46.2%)
  >10                               4 (7.7%)       7 (13.5%)
Pretreatment PSA (ng/mL)
  Median                            12.8           5.6
  <4                                2 (3.8%)       7 (13.5%)
  4-10                              4 (7.7%)       26 (50%)
  >10                               7 (13.5%)      6 (11.5%)

Table 2: Cox proportional hazards model analysis of the metastatic potential score and clinical predictors.

Component                 Hazard ratio    P       95% CI
Univariate
  MPS                     2.87            0.02    1.2-7.2
  Pretreatment PSA        1.00            0.04    1.0-1.1
  Clinical stage T2-T3    1.27            0.70    0.4-4.2
Multivariate
  MPS                     2.61            0.05    1.0-6.8
  Clinical stage T2-T3    0.90            0.87    0.3-3.1
  Pretreatment PSA        1.00            0.18    1.0-1.0

Figure 8: Biopsy versus pathology Gleason score. The difference between the Gleason score as measured
from a biopsy of the tumor relative to the pathological assessment of the score using the radical
prostatectomy surgical specimen (y-axis). The x-axis represents the sample index.

Figure 9: Boxplot showing the metastatic potential scores for all samples involved in the analysis. All
high-risk tumors are shown in the left three boxes (metastases, progressors, and lymph-node-positive
samples), while unknown control primary tumors and the publicly available cell line data are shown
in the right boxes. The red "+" symbols in the lymph-node-positive box represent those samples from
the MSK dataset, distinguishing them from the SU cohort lymph-node-positive samples. The green "x"
symbols in the control primary tumors plot represent selected low-risk primary tumors (individuals with
no biochemical recurrence (PSA) for at least 80 months). Group sizes shown under the boxes, left to
right: n = 49, n = 13, n = 18, n = 263, n = 337.

Consistent with our assumption that the control cohorts contained a fraction of mPTs,
their MPS overlapped the MPS range of the cases. Furthermore, control primary tumors
(from MSK cohort) that did not recur biochemically (as measured by PSA) after 80 months
of  followup,  (represented  by  green  Xs  in  Figure  9)  were  not  significantly  correlated  with  the 
MPS.  To  determine  whether  other  cancer  types  exhibited  a  similar  metastatic  landscape  of 
CNAs  to  that  observed  in  prostate  cancer,  we  calculated  the  metastatic  potential  score  for  337 
cancer  cell  lines.  We  observed  an  overall  distribution  that  overlapped  with  low-risk  prostate 
primary tumors (Figure 9). However, 22 of the 337 cell lines ranked by MPS were above the
75th percentile of the prostate primary tumors and metastases. These cell lines originated
from  tumors  of  the  lung  (n  =  10),  breast  (n  =  3),  colon  (n  =  2),  and  melanoma  (n  =  2).  Other 
singletons  in  this  group  of  22  cell  lines  originated  from  thyroid,  rectum,  pharynx,  pancreas, 
and  kidney. 


3.3.  Biomarker  Functional  Significance 

Another  way  to  validate  our  algorithms  is  by  data  mining  the  functional  attributes  of 
the  metastasis  genes  identified  by  the  selection  model.  As  expected,  many  of  the  top- 
ranking  metastasis  genes  identified  have  molecular  functions  related  to  alteration  of  nuclear 
and  extracellular  matrix  structure  and  metabolic  modification  that  enhance  processes 
characteristic of escape from anoikis (a key metastasis-specific process). A heat map of the
CNA events of signature genes for all prostate tumors is suggestive of a path toward the
different high frequency amplification versus deletion events that contrast the high-risk and
low-risk tumors (Figure 10). The mid-risk region with its relative paucity of signature events
may  represent  the  starting  point  of  two  alternative  pathways  of  subsequent  copy  number 
alteration,  one  leading  to  metastasis  and  the  other  to  an  indolent  state.  The  locking  in  of 
these  "antimetastasis"  events  in  indolent  tumors  may  explain  why  they  failed  to  metastasize 
despite  extended  periods  of  watchful  waiting. 

One of the top predictor genes, the solute carrier family SLC7A5 gene, deleted on
chromosome 16q24.2, encodes a neutral amino acid transporter protein (LAT1) that has
been implicated in multiple cancers (prostate [16], breast [17], ovarian [18], lung [19], and
brain [20]) and has been shown to have utility as a diagnostic [21-23] and drug target
in cell line [24-26] and preclinical animal models [27]. The normal function of LAT1 is
to regulate cellular amino acid concentration through L-glutamine efflux and L-leucine influx.
Reduced activity of LAT1 results in increased concentrations of L-glutamine, which has been
shown to constitutively fuel mTOR activity [28]. Seven other solute carrier superfamily
members (SLCO5A1, SLC7A2, SLC10A5, SLC26A7, SLC25A37, SLC38A8, and SLC39A14)
were predictive of metastatic potential in our models, likely creating a cellular environment
conducive to metastasis.

A second subset of signature genes included 6 Cadherin family members encoding
calcium-dependent cell adhesion glycoproteins (CDH2, CDH8, CDH13, CDH15, CDH17,
and  PCDH9).  Many  of  the  Cadherin  family  proteins  have  putative  functions  associated  with 
metastasis  progression  [29]  and  have  been  included  in  diagnostic  panels  [30,  31]. 

A third subset of 5 genes predicted to contribute to metastatic potential were
potassium channels: KCNB2, KCNQ3, KCNAB1, KCTD8, and KCNH4. Notably, 3 other
potassium channels reside in the highly amplified region between 8q13 and 8q24 (KCNS2,
KCNV1, and KCNK9) that did not rank high in our analysis but may have weak or modifier
effects. High levels of cytoplasmic potassium ion concentrations have been shown to inhibit
the hallmark mitochondrial apoptotic cascade of membrane disruption and ensuing release of
cytochrome C, caspase, and nuclease degradation of cellular components [32]. Furthermore,
another study showed that the methylation status of the potassium channel KCNMA1 (10q22.3)
was predictive of prostate cancer recurrence [33]. The activity of voltage-gated potassium
channels in the prostate cancer cell lines LNCaP (low metastatic potential) and PC3 (high
metastatic potential) was observed to be markedly different [34]. The complete set of
metastasis signature genes likely represents various subsets of functions. Representation
of different gene family members suggests that each tumor may have a unique profile to
progress to metastasis, yet different members of a gene family may contribute to a functional
redundancy. Notably, the genomic DNA landscape around the androgen receptor locus on
chromosome X represents a compelling observation linking CNAs to a functional cause and
effect response of androgen ablation therapy (Figure 5).

Figure 10: Heatmap showing copy number amplifications and deletions for tumor samples in the gene
signature. The genes are arranged in genomic order; position is indicated by the colored bar on the right.
The tumor samples (x-axis) are arranged by subtype (metastatic (METS), progressors (mPTs), and control
primary tumors) and further sorted by their metastatic potential score. A strong pattern emerges in the
metastasis samples on the left and is shared by the progressors and high-risk primary tumors. Further
towards the right, the metastatic pattern diminishes and even shows a reversal in copy number pattern in
some chromosomal areas.


4.  Summary 

In  this  study,  we  developed  a  semisupervised  clustering  algorithm  that  can  infer  the 
classification  of  a  primary  tumor  based  on  metastatic  risk.  This  was  essential  to  overcome 
the  limitations  inherent  to  prostate  cancer  cohorts  for  collecting  long-term  clinical  outcome 
data.  Our  novel  approach  to  modeling  the  CNA  data  based  on  Darwin's  evolutionary 
selection  theory  allowed  us  to  identify  genes  associated  with  the  specific  metastatic  processes 
of  anoikis.  Current  clinical  models  for  assessing  risk  are  aimed  at  predicting  biochemical 
recurrence, rather than metastasis, and do not include genomic information. This limitation
was underscored in a study with a large cohort of more than 10,000 men who had
undergone  radical  prostatectomy  [35].  Within  that  cohort,  about  20%  of  men  developed 
biochemical  recurrence  within  5  years  of  the  procedure,  but  subsequently  only  10%  of  the 
men  with  biochemical  recurrence  developed  distant  metastases  after  12  years. 

This  proposed  new  classification  method  and  selection  model  allowed  us  to  develop 
a  metastatic  potential  score  that  could  be  used  for  predicting  an  individual's  metastasis-free 
survival  at  the  time  of  diagnosis.  With  validation  in  additional  cohorts  and  statistical  models 
with  known  metastasis  outcome,  this  approach  may  lead  to  a  significant  advancement  in 
determining  whether  aggressive  treatment  of  prostate  cancer  is  necessary.  This  predictor 
might  be  important  for  correctly  categorizing  men  at  the  time  of  diagnosis  and  could 
predict  whether  surgery,  radiation  therapy,  or  watchful  waiting  was  warranted.  Because 
the  proposed  tool,  tumor  genomic  analysis,  is  comprehensive  for  identifying  the  genetic 
changes  that  are  associated  with  the  pathogenesis  of  metastasis,  there  is  a  greater  likelihood 
of  selecting  a  sufficient  number  of  markers  that  are  both  sensitive  and  specific  predictors.  This 
method  could  be  applied  to  other  cancers  (e.g.,  breast)  that  exhibit  variation  in  the  metastatic 
potential  of  the  primary  tumor  and  have  similar  difficulties  in  collecting  tumor  samples  with 
long-term clinical outcome data.

Genomic Signatures of Metastasis in Prostate Cancer


FIELD  OF  THE  DISCLOSURE 

[0001]  This  disclosure  relates  to  metastatic  gene  signatures.  More  particularly,  this 
disclosure has identified copy number alterations (CNAs) around genes that are
overrepresented in metastases, which serve as the basis for predicting whether a primary tumor
will  metastasize. 

BACKGROUND  ART 

[0002]  Prostate  cancer  is  a  common  public  health  problem.  In  2010,  this  disease  was 
diagnosed  in  an  estimated  217,730  men  (28%  of  all  male  cancers)  and  resulted  in  32,050 
deaths (11% of male cancer deaths) (Jemal et al., CA Cancer J Clin 59(4):225-49 (2009)). If
left  untreated,  the  majority  of  prostate  cancers  remain  asymptomatic  and  indolent  for  decades 
(Klotz  et  al.,  Journal  of  Clinical  Oncology  (2010)  28:126-31).  If  treated  with  radical 
prostatectomy  or  radiation  therapy,  the  risk  of  metastasis  is  reduced,  but  erectile  dysfunction, 
urinary  incontinence  and  rectal  bleeding  may  occur,  affecting  the  patient's  quality  of  life. 
Because  it  is  currently  difficult  to  determine  accurately  which  patients  will  develop  metastatic 
disease,  physicians  treat  patients  with  mid-to-late  stage  local  disease  aggressively,  even  when 
such  treatment  may  not  be  required.  Clinical  parameters,  such  as  serum  concentration  of 
prostate  specific  antigen  (PSA),  extension  beyond  surgical  margins,  invasion  of  seminal 
vesicles,  extension  beyond  the  capsule,  Gleason  score,  prostate  weight,  race  and  year  of 
surgery,  are  employed  in  existing  nomograms  for  prediction  of  local  recurrences  (Ohori  et 
al.,  Mod  Pathol  17(3):  349-359  (2004)),  but  local  recurrence  and,  therefore,  these 
parameters  have  limited  utility  for  predicting  progression  of  the  disease  to  distant  sites 
(Nakagawa et al., PLoS One 3(5):e2318 (2008)). Development of a robust risk model that
accurately predicts the potential of a local prostate cancer to metastasize would justify
aggressive treatment in high-risk cases and improve the quality of life for men with indolent
disease. 

SUMMARY  OF  THE  DISCLOSURE 

[0003]  This  disclosure  is  directed  to  a  method  of  determining  the  risk  of  metastasis  of 
prostate cancer in a human subject who has or had prostate cancer. The method is premised
on identification of metastatic signature genes and genomic regions whose copy number
alterations  are  overrepresented  in  metastases. 

[0004]  In  one  embodiment,  a  metastatic  gene  signature  set  includes  at  least  the  top  80 
genes  and  genomic  regions  shown  in  Table  6.  In  another  embodiment,  a  metastatic  gene 
signature  set  includes  at  least  the  top  40  genes  and  genomic  regions  shown  in  Table  6.  In  still 
another  embodiment,  a  metastatic  gene  signature  set  includes  at  least  the  top  20  genes  and 
genomic  regions  shown  in  Table  6.  In  yet  another  embodiment,  a  metastatic  gene  signature 
set  includes  at  least  the  top  12  genes  and  genomic  regions  shown  in  Table  6. 

[0005]  In  a  specific  embodiment,  the  method  disclosed  herein  includes  determining  in  a 
prostate  sample  from  the  subject  the  number  of  copies  per  cell  of  at  least  12  genes  and/or 
genomic  regions  of  a  metastatic  gene  signature  set  which  consists  of  the  top  20  genes  and 
gene regions listed in Table 6; determining alterations in the number of copies per cell for
each of the at least 12 genes and/or genomic regions as compared to the number of copies per
cell in non-cancer cells; and determining the risk of prostate cancer metastasis based on the
copy number alterations (CNAs) determined.

[0006] In one embodiment, the at least 12 genes and/or genomic regions being analyzed
are the top 12 genes and genomic regions, namely, the PPP3CC genomic region, the
SLCO5A1 genomic region, the SLC7A5 genomic region, the SLC7A2 genomic region, the
CRISPLD2 genomic region, the CDH13 gene, the CDH8 gene, the CDH2 gene, the ASAH1
genomic region, the KCNB2 genomic region, the KCNH4 genomic region, and the KCTD8
gene.

[0007] In another embodiment, the at least 12 genes and/or genomic regions being
analyzed include all of the top 20 genes and genomic regions listed in Table 6, namely, the
PPP3CC genomic region, the SLCO5A1 genomic region, the SLC7A5 genomic region, the
SLC7A2 genomic region, the CRISPLD2 genomic region, the CDH13 gene, the CDH8 gene,
the CDH2 gene, the ASAH1 genomic region, the KCNB2 genomic region, the KCNH4
genomic region, the KCTD8 gene, the JPH1 genomic region, the MEST genomic region, the
NCALD genomic region, the COL19A1 gene, the MAP3K7 genomic region, the YWHAG
gene, the NOL4 genomic region, and the ENOX1 gene.

[0008]  According  to  the  method  disclosed  herein,  an  increase  in  the  copy  number  per  cell 
for any of the SLCO5A1 genomic region, the KCNB2 genomic region, the KCNH4 genomic
region, the JPH1 genomic region, the NCALD genomic region, or the YWHAG gene,
correlates with an increased risk of prostate cancer metastasis; and a decrease in the copy
number per cell for any of the PPP3CC genomic region, the SLC7A5 genomic region, the
SLC7A2 genomic region, the CRISPLD2 genomic region, the CDH13 gene, the CDH8 gene,
the CDH2 gene, the ASAH1 genomic region, the KCTD8 gene, the MEST genomic region, the
COL19A1  gene,  the  MAP3K7  genomic  region,  the  NOL4  genomic  region,  or  the  ENOX1 
gene,  correlates  with  an  increased  risk  of  prostate  cancer  metastasis. 

[0009]  The  copy  number  of  a  gene  or  genomic  region  can  be  determined  using  a  nucleic 
acid  probe  that  hybridizes  to  the  gene  or  genomic  region  in  the  genomic  DNA  present  in  the 
sample.  Hybridization  can  be  performed  in  an  array  format,  for  example. 

[0010]  The  risk  of  metastasis  can  be  determined  based  on  calculating  a  metastatic 
potential  score: 


$$ M(SM) = \sum_{i} Z_{adjust,i} \cdot Dir_{sig}(i) \cdot Dir_{samp}(i) $$

wherein  the  logistic  adjusted  Z-scores  ( Zadjust )  for  each  of  the  genes  of  the  metastatic 
signature  set  are  set  forth  in  Table  6  and  wherein  if  the  CNAs  of  the  signature  and  the  sample 
are in the same direction, the coefficient (Dir) will be 1; if they are in opposite directions, the
coefficient will be -1; and if no alteration in copy number is detected for a gene, the
coefficient  for  that  gene  =  0;  and  comparing  the  metastatic  potential  score  to  a  control  value, 
wherein  an  increase  in  the  score  correlates  with  an  increased  risk  of  metastasis. 

[0011]  Further  disclosed  herein  are  diagnostic  kits  for  performing  the  method  of 
determining  the  risk  of  metastasis  of  prostate  cancer.  The  kits  can  include  nucleic  acid  probes 
that  bind  to  one  or  more  metastatic  signature  genes  and  genomic  regions  disclosed  herein,  and 
other  assay  reagents.  The  nucleic  acid  probes  can  be  provided  on  a  solid  support  such  as  a 
microarray  slide.  The  kits  can  also  include  other  materials  such  as  instructions  or  protocols 
for  performing  the  method. 

BRIEF  DESCRIPTION  OF  THE  DRAWINGS 

[0012]  Figure  1.  Boxplot  showing  the  metastatic  potential  scores  for  all  samples 
involved  in  the  analysis.  All  high-risk  tumors  are  shown  in  the  left  three  boxes  (metastases, 
primary  tumors  that  progressed  to  metastasis,  and  lymph  node  positive  primary  tumors),  whereas 
unknown control primary tumors and the publicly available cell line data are shown in the
right  boxes.  The  "+"  symbols  in  the  lymph  node  positive  box  represent  those  samples  from 
the  MSK  dataset  and  indicate  that  there  is  no  difference  between  the  two  lymph  node  positive 
cohorts.  The  "x"  symbols  in  the  control  primary  tumors  plot  represent  selected  low-risk 
primary  tumors  (individuals  with  no  biochemical  recurrence  (PSA)  for  at  least  80  months). 

[0013]  Figure  2.  Left  graph,  ROC-curve  for  prediction  of  primary  tumors  that  progressed 
to  metastasis  using  the  metastatic  potential  score.  The  model  used  to  make  this  prediction 
was  run  using  a  random  75%  of  samples  from  the  data,  whereas  the  prediction  was  run  using 
the  remaining  25%  (13  known  mPTs  and  39  control  primary  tumors).  The  random  model  is 
indicated  by  the  diagonal  line  (AUC  =  0.5).  The  crosshair  indicates  the  cut  point  used  to 
separate  the  data  for  survival  analysis  (shown  in  the  right  graph).  Right  graph,  Kaplan-Meier 
survival curve showing metastasis-free probability. The data were split in half by metastatic
potential score, and progression status and follow-up time were assessed. The log rank test
(p-value) compares the high-risk and low-risk sample groups.

[0014] Figure 3. Simulation in which a subset of genes was sampled (n=20) and the genes that
were over-represented in the region where the AUC and r² were maximized (box) were
ranked by their frequency. This simulation was also performed for n = 40, 50, 80, and 100
genes.

[0015] Figure 4. Extending-window AUC (red) and extending-window r² (black) based on
the sorted hierarchy of genes.

[0016]  Figure  5.  ROC  curve  (left  panel).  Kaplan-Meier  depiction  of  Cox  proportional 
hazards  model  (right  panel). 

DETAILED  DESCRIPTION 

[0017]  This  disclosure  provides  a  risk  model  that  reliably  predicts  those  tumors  that  are 
likely  to  metastasize,  while  minimizing  the  false  positive  rate  and  increasing  the  specificity  of 
treatment  decisions. 

[0018]  The  risk  model  has  been  developed  through  the  identification  of  copy  number 
alterations  (CNAs)  around  genes  that  were  over-represented  in  metastases  and  primary  tumors 
that  later  progressed  to  metastases.  These  CNAs  are  predictive  of  whether  a  primary  tumor 
will metastasize. Cross-validation analysis has revealed a predictive accuracy of 80.5%, and
log rank analysis has shown the metastatic potential score to be significantly related to
the endpoint of metastasis-free survival (p=0.014). In contrast to other reported risk models,
the  risk  model  disclosed  herein  based  on  the  study  of  CNAs  predicts  distant  metastasis 
progression  as  the  clinical  endpoint  without  the  use  of  intermediate  endpoints  (such  as 
biochemical  markers  of  progression).  The  hierarchy  of  the  genes  and  genomic  regions  that 
contribute  to  the  prediction  of  metastatic  potential  has  also  been  determined. 

[0019]  Accordingly,  disclosed  herein  is  a  method  for  determining  the  risk  of  metastasis  of 
prostate  cancer  in  a  human  subject  who  has  or  had  prostate  cancer.  This  method  is  based  on 
determining, in a prostate sample from the subject, copy number alterations (CNAs) of genes
and genomic regions of a metastatic gene signature set, and correlating the CNAs with a risk
of  prostate  cancer  metastasis. 

[0020]  Metastatic  Gene  Signature 

[0021]  Metastatic  gene  signatures  have  been  developed  by  the  present  inventors  from 
studies  of  the  genomic  landscape  of  copy  number  alterations  in  294  primary  prostate  tumors 
and  49  prostate  metastases  from  5  independent  cohorts,  as  described  in  more  detail  in  the 
examples  hereinbelow.  368  copy  number  alterations  have  been  identified  around  genes  that 
are  over-represented  in  metastases  and  are  predictive  of  whether  a  primary  tumor  will 
metastasize.  Cross-validation  analysis  has  revealed  a  prediction  accuracy  of  80.5%. 

[0022]  Accordingly,  in  one  embodiment,  this  disclosure  provides  a  metastatic  gene 
signature  set  which  includes  the  368  genes  identified  herein,  set  forth  in  Table  6. 

[0023]  As  displayed  in  Table  6,  the  368  genes  include  a  number  of  "clumps",  each  clump 
identified  by  a  "Clump  Index  Number".  A  "clump",  as  used  herein,  refers  to  a  group  of  genes 
that  are  adjacent  to  one  another  on  the  chromosome,  and  copy  number  alterations  are  detected 
for  the  genomic  region  which  includes  this  group  of  genes  in  connection  with  prostate  cancer 
metastasis. A multi-member clump may include both drivers (genes that cause or more
directly associate with metastasis) and passengers (genes that indirectly associate with
metastasis because of their close proximity to a metastasis driver gene).

[0024]  The  term  "genomic  region"  is  used  herein  interchangeably  with  the  term  "clump", 
and  is  typically  used  herein  in  conjunction  with  the  name  of  a  member  gene  within  the 
genomic region or clump. For example, the PPP3CC gene listed in the first row of Table 6
belongs to Clump Index 26, which also includes the genes KIAA1967, BIN3, SORBS3,
PDLIM2, RHOBTB2, SLC39A14, EGR3, and C8orf58. Therefore, Clump Index 26 is also
referred to herein as "the PPP3CC genomic region".

[0025]  While  many  of  the  368  genes  belong  to  clumps,  some  of  the  genes  do  not  belong 
to any clump and copy number alterations have been identified specifically around each of
these genes in connection with metastasis of prostate cancer. For example, as shown in Table
6 (with “NA” in the Clump Index column), CDH13, CDH8, CDH2, CTD8, COL19A1,
YWHAG,  and  ENOX1,  among  many  others,  are  genes  which  do  not  belong  to  any  clump. 

[0026]  In  other  embodiments,  this  disclosure  provides  smaller  metastatic  gene  signature 
sets  which  include  at  least  80,  at  least  40,  at  least  20,  or  at  least  12,  non-overlapping  genes 
and/or  genomic  regions  listed  in  Table  6. 

[0027] By "non-overlapping" it is meant that the genes selected to constitute a smaller
signature  set  do  not  belong  to  the  same  genomic  region  or  clump. 
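
The selection of a smaller, non-overlapping signature set can be sketched as follows. This is a minimal illustration only, assuming each Table 6 row is available as a gene name, a Clump Index (None standing for "NA"), and a rank in the hierarchy. The SignatureEntry type, the function name, and the rank values in the example are hypothetical and are not taken from Table 6.

```python
from typing import List, NamedTuple, Optional


class SignatureEntry(NamedTuple):
    gene: str
    clump_index: Optional[int]  # None stands for "NA" (gene not in any clump)
    rank: int                   # position in the hierarchy (hypothetical values below)


def top_non_overlapping(entries: List[SignatureEntry], n: int) -> List[SignatureEntry]:
    """Return the n best-ranked entries, keeping at most one gene per clump."""
    seen_clumps = set()
    chosen = []
    for entry in sorted(entries, key=lambda e: e.rank):
        if entry.clump_index is not None:
            if entry.clump_index in seen_clumps:
                continue  # another gene already represents this genomic region
            seen_clumps.add(entry.clump_index)
        chosen.append(entry)
        if len(chosen) == n:
            break
    return chosen


# Illustrative rows: PPP3CC and BIN3 share Clump Index 26; CDH13 and CDH8 are "NA".
example_entries = [
    SignatureEntry("PPP3CC", 26, 1),
    SignatureEntry("BIN3", 26, 2),
    SignatureEntry("CDH13", None, 3),
    SignatureEntry("CDH8", None, 4),
]
selected = [e.gene for e in top_non_overlapping(example_entries, 3)]
```

In this sketch BIN3 is skipped because PPP3CC already represents Clump Index 26, so the three selected entries do not overlap.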

[0028]  As  described  in  more  detail  in  the  examples  hereinbelow,  the  metastatic  potential 
score  derived  from  the  complete  set  of  368  genes  resulted  in  a  predictive  accuracy  of  AUC  = 
81%.  The  hierarchy  of  the  genes  that  contribute  to  this  prediction  has  been  determined,  as 
shown  in  Table  6,  based  on  a  procedure  that  sought  to  identify  genes  that  maximize  the 
prediction  accuracy  (AUC  =  81%)  and  also  maximize  the  regression  coefficient  between  the 
metastatic  potential  scores  from  the  368  genes  versus  any  iteration  of  the  randomly  sampled 
subset  of  genes. 

[0029]  Accordingly,  in  one  embodiment,  a  metastatic  gene  signature  set  includes  at  least 
the  top  80  genes  and  genomic  regions  shown  in  Table  6. 

[0030]  In  another  embodiment,  a  metastatic  gene  signature  set  includes  at  least  the  top  40 
genes  and  genomic  regions  shown  in  Table  6. 

[0031]  In  still  another  embodiment,  a  metastatic  gene  signature  set  includes  at  least  the 
top  20  genes  and  genomic  regions  shown  in  Table  6. 

[0032]  In  yet  another  embodiment,  a  metastatic  gene  signature  set  includes  at  least  the  top 
12  genes  and  genomic  regions  shown  in  Table  6. 

[0033] Determination of Copy Number Alterations (CNAs)

[0034] A copy number alteration is a variation in the number of copies of a gene or
genomic  region  present  in  the  genome  of  a  cell.  A  normal  diploid  cell  typically  has  two 
copies  of  each  chromosome  and  the  genes  contained  therein.  Copy  number  alterations  may 
increase  the  number  of  copies,  or  decrease  the  number  of  copies. 

[0035]  The  direction  of  copy  number  alteration  for  each  of  the  368  metastatic  signature 
genes  associated  with  metastasis  is  identified  in  Table  6  as  -1  or  1,  representing  deletions  and 
amplifications, respectively. For example, for the PPP3CC genomic region (Clump Index 26),
identified  as  "-1"  in  Table  6,  deletions  of  this  genomic  region  are  overrepresented  in 
metastatic  prostate  cancer  or  primary  prostate  cancers  that  later  progressed  to  metastases,  and 
are  therefore  indicative  of  a  higher  risk  of  metastasis  of  prostate  cancer.  Other  genes  and 
genomic  regions  whose  deletions  are  predictive  of  a  higher  risk  of  metastasis  of  prostate 
cancer  include,  for  example,  the  SLC7A5  genomic  region,  the  SLC7A2  genomic  region,  the 
CRISPLD2  genomic  region,  the  CDH13  gene,  the  CDH8  gene,  the  CDH2  gene,  the  ASAH1 
genomic  region,  the  CTD8  gene,  the  MEST  genomic  region,  the  COL19A1  gene,  the 
MAP3K7  genomic  region,  the  NOL4  genomic  region,  and  the  ENOX1  gene.  On  the  other 
hand, for the SLCO5A1 genomic region (Clump Index 33), identified as "1" in Table 6,
amplifications  of  this  genomic  region  are  overrepresented  in  metastatic  prostate  cancer  or 
primary  prostate  cancers  that  later  progressed  to  metastases,  and  are  therefore  indicative  of  a 
higher  risk  of  metastasis  of  prostate  cancer.  Other  genes  and  genomic  regions  whose 
amplifications  are  indicative  of  a  higher  risk  of  metastasis  of  prostate  cancer  include,  for 
example,  the  KCNB2  genomic  region,  the  KCNH4  genomic  region,  the  JPH1  genomic 
region,  the  NCALD  genomic  region,  and  the  YWHAG  gene. 

[0036]  To  determine  whether  there  is  any  copy  number  alteration  for  a  given  gene  or 
genomic  region,  a  prostate  sample  is  obtained  from  a  subject  of  interest.  A  prostate  sample 
refers  to  a  cell  or  tissue  sample  taken  from  the  prostate  of  a  subject  of  interest  which  sample 
contains  genomic  DNA  to  be  analyzed  for  CNAs.  Methods  of  procuring  cell  and  tissue 
samples  are  well  known  to  those  skilled  in  the  art,  including,  for  example,  tissue  sections, 
needle biopsy, surgical biopsy, and the like. For a cancer patient, cells and tissue can be
obtained from a tumor. A cell or tissue sample can be processed to extract, purify or partially
purify,  or  enrich  or  amplify  the  nucleic  acids  in  the  sample  for  further  analysis. 

[0037]  Nucleic  acid  probes  are  designed  based  on  the  genes  and  genomic  regions  of  a 
metastatic  signature  gene  set  which  permit  detection  and  quantification  of  CNAs  in  the  genes 
and  genomic  regions. 

[0038]  In  one  embodiment,  the  probes  are  composed  of  a  collection  of  nucleic  acids  that 
specifically  hybridize  to  the  full  set  of  368  genes  of  the  metastatic  signature  gene  set. 

[0039]  In  another  embodiment,  the  probes  are  composed  of  a  collection  of  nucleic  acids 
that  specifically  hybridize  to  the  top  80  genes  and  genomic  regions  shown  in  Table  6. 

[0040]  In  still  another  embodiment,  the  probes  are  composed  of  a  collection  of  nucleic 
acids  that  specifically  hybridize  to  the  top  40  genes  and  genomic  regions  shown  in  Table  6. 

[0041]  In  yet  another  embodiment,  the  probes  are  composed  of  a  collection  of  nucleic 
acids  that  specifically  hybridize  to  the  top  20  genes  and  genomic  regions  shown  in  Table  6. 

[0042]  In  a  further  embodiment,  the  probes  are  composed  of  a  collection  of  nucleic  acids 
that  specifically  hybridize  to  the  top  12  genes  and  genomic  regions  shown  in  Table  6. 

[0043]  By  "specifically  hybridize"  it  is  meant  that  a  nucleic  acid  probe  binds 
preferentially  to  a  target  gene  or  genomic  region  under  stringent  conditions,  and  to  a  lesser 
extent  or  not  at  all  to  other  genes  or  genomic  regions. 

[0044]  "Stringent  conditions"  in  the  context  of  nucleic  acid  hybridization  are  known  in  the 
art, e.g., as described in Sambrook, Molecular Cloning: A Laboratory Manual (2nd ed.) vol. 1-3,
Cold Spring Harbor Laboratory, Cold Spring Harbor Press, New York (1989). Generally,
highly  stringent  hybridization  and  wash  conditions  are  selected  to  be  about  5°C  lower  than  the 
thermal  melting  point  for  a  specific  sequence  at  a  defined  ionic  strength  and  pH.  An  example 
of  highly  stringent  hybridization  conditions  is  42°C  in  standard  hybridization  solutions.  An 
example of highly stringent wash conditions is 0.2X SSC at 65°C for 15 minutes. An
example of medium stringency wash conditions is 1X SSC at 45°C for 15 minutes. An example
of a low stringency wash is 4X-6X SSC at room temperature to 40°C for 15 minutes.

[0045]  Nucleic  acid  probes  for  purposes  of  this  invention  should  be  at  least  15  nucleotides 
in  length  to  permit  specific  hybridization  to  a  target  gene  or  genomic  region,  and  can  be  50, 
100, 200, 400, 600, 800, 1000, or more nucleotides in length, or of a length ranging between
any two of the above-listed values. A nucleic acid probe designed to specifically hybridize to
a  target  gene  can  include  the  full  length  sequence  or  a  fragment  of  the  gene.  A  nucleic  acid 
probe  designed  to  specifically  hybridize  to  a  specific  target  genomic  region  can  include  at 
least  a  fragment  of  the  genomic  region,  e.g.,  at  least  the  full  length  sequence  or  a  fragment  of 
a  gene  (any  gene)  within  the  genomic  region.  Alternatively,  a  nucleic  acid  probe  shares  at 
least  80%,  85%,  90%,  95%,  98%,  99%  or  greater  sequence  identity  with  the  target  gene  to 
permit  specific  hybridization. 

[0046]  The  hybridized  nucleic  acids  can  be  detected  by  detecting  one  or  more  labels 
attached  to  the  sample  or  probe  nucleic  acids.  The  labels  can  be  incorporated  by  a  variety  of 
methods  known  in  the  art,  and  include  detectable  labels  such  as  magnetic  beads,  a  fluorescent 
compound (e.g., Texas red, rhodamine, green fluorescent protein and the like), radioisotopes,
enzymes, and colorimetric labels (e.g., colloidal gold particles). In other embodiments, the
sample  or  probe  nucleic  acids  can  be  conjugated  with  one  member  of  a  binding  pair,  and  the 
other  member  of  the  binding  pair  is  conjugated  with  a  detectable  label.  Binding  pairs  suitable 
for  use  herein  include  biotin  and  avidin,  and  hapten  and  a  hapten-specific  antibody. 

[0047]  A  number  of  techniques  for  analyzing  chromosomal  alterations  are  well  known  in 
the  art.  For  example,  fluorescence  in-situ  hybridization  (FISH)  can  be  used  to  study  copy 
numbers  of  individual  genetic  loci  or  regions  on  a  chromosome.  See,  e.g.,  Pinkel  et  al.,  Proc. 
Natl.  Acad.  Sci.  USA  85:  9138-9142  (1988).  Comparative  genomic  hybridization  (CGH)  can 
also  be  used  to  detect  copy  number  alterations  of  chromosomal  regions.  See,  e.g.,  U.S.  Patent 
No.  7,638,278. 

[0048]  In  some  embodiments,  hybridization  is  performed  on  a  solid  support.  For 
example, probes that specifically hybridize to signature genes and genomic regions can be
spotted or immobilized on a surface, e.g., in an array format, and subsequently samples
containing  genomic  DNA  are  added  to  the  array  to  permit  specific  hybridization. 

[0049]  Immobilization  of  nucleic  acid  probes  on  various  solid  surfaces  and  at  desired 
densities  (e.g.,  high  densities  with  each  probe  concentrated  in  a  small  area)  can  be  achieved 
by  using  methods  and  techniques  known  in  the  art.  See,  e.g.,  U.S.  Patent  7,482,123  B2. 
Examples  of  solid  surfaces  include  nitrocellulose,  nylon,  glass,  quartz,  silicones, 
polyformaldehyde,  cellulose,  cellulose  acetate;  and  plastics  such  as  polyethylene, 
polypropylene,  polystyrene,  and  the  like;  gelatins,  agarose  and  silicates,  among  others.  High 
density immobilization of nucleic acid probes is used for high complexity comparative
hybridizations, which reduces the total amount of sample nucleic acids required for binding
to each immobilized probe.

[0050]  In  some  embodiments,  the  arrays  of  nucleic  acid  probes  can  be  hybridized  with 
one  population  of  samples,  or  can  be  used  with  two  populations  of  samples  (one  test  sample 
and  one  reference  sample).  For  example,  in  a  comparative  genomic  hybridization  assay,  a 
first  collection  of  nucleic  acids  (e.g.,  sample  from  a  possible  tumor)  is  labeled  with  a  first 
label,  while  a  second  collection  of  nucleic  acids  (e.g.,  control  from  a  healthy  cell  or  tissue)  is 
labeled  with  a  second  label.  The  ratio  of  hybridization  of  the  nucleic  acids  is  determined  by 
the  ratio  of  the  two  labels  binding  to  each  member  in  the  array.  Where  there  are  genomic 
deletions  or  amplifications,  differences  in  the  ratio  of  the  signals  from  the  two  labels  will  be 
detected  and  provide  a  measure  of  the  copy  number. 
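
The two-label comparison described above can be illustrated with a short sketch that converts a test/reference signal ratio into a direction of copy number alteration (1 for a gain, -1 for a loss, 0 for no detected change). The use of a log2 ratio and the threshold of 0.3 are assumptions chosen for illustration; the source does not specify how the ratio is thresholded.

```python
import math


def call_direction(test_signal: float, ref_signal: float, threshold: float = 0.3) -> int:
    """Classify one probe as gain (1), loss (-1), or no change (0).

    test_signal and ref_signal are the intensities of the two labels at the
    same array element. The threshold on the log2 ratio is an assumed value.
    """
    if test_signal <= 0 or ref_signal <= 0:
        raise ValueError("signal intensities must be positive")
    log_ratio = math.log2(test_signal / ref_signal)
    if log_ratio >= threshold:
        return 1
    if log_ratio <= -threshold:
        return -1
    return 0


gain = call_direction(1500.0, 1000.0)     # log2(1.5) is about 0.58
loss = call_direction(500.0, 1000.0)      # log2(0.5) is exactly -1
neutral = call_direction(1000.0, 1000.0)  # log2(1) is 0
```

The returned values use the same -1/0/1 convention that Table 6 uses for the direction of each signature gene, so they can serve directly as the per-gene sample direction in a downstream score.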

[0051]  Determination  of  Risk 

[0052]  Once  copy  number  alterations  for  each  of  a  metastatic  signature  gene  set  have  been 
determined,  the  risk  for  metastasis  can  be  correlated  with  the  copy  number  alterations 
detected.  An  increase  in  the  copy  number  per  cell  of  the  sample  for  one  or  more  of  the  genes 
or  genomic  regions  of  a  metastatic  signature  gene  set  disclosed  herein,  whose  amplifications 
have  been  associated  with  metastatic  prostate  cancer,  will  indicate  a  higher  risk  of  metastasis 
as  compared  to  a  control  (e.g.,  a  sample  obtained  from  a  healthy  individual)  in  which  no 
increase in the copy number occurs. On the other hand, a decrease in the copy number
per cell of the sample for one or more of the genes or genomic regions of a metastatic signature gene set
disclosed  herein,  whose  deletions  have  been  associated  with  metastatic  prostate  cancer,  will 
indicate  a  higher  risk  of  metastasis  as  compared  to  a  control  in  which  no  decrease  in  the  copy 
number  is  observed. 

[0053]  For  example,  for  a  metastatic  signature  gene  set  composed  of  the  top  20  genes  and 
genomic  regions  listed  in  Table  6,  an  increase  in  the  copy  number  per  cell  of  the  sample  for 
all of the SLCO5A1 genomic region, the KCNB2 genomic region, the KCNH4 genomic
region, the JPH1 genomic region, the NCALD genomic region, and the YWHAG gene, and a
decrease in the copy number per cell of the sample for all of the PPP3CC
genomic  region,  the  SLC7A5  genomic  region,  the  SLC7A2  genomic  region,  the  CRISPLD2 
genomic  region,  the  CDH13  gene,  the  CDH8  gene,  the  CDH2  gene,  the  ASAH1  genomic 
region,  the  CTD8  gene,  the  MEST  genomic  region,  the  COL19A1  gene,  the  MAP3K7 
genomic  region,  the  NOL4  genomic  region,  and  the  ENOX1  gene,  correlate  with  an  increased 
risk  of  prostate  cancer  metastasis.  However,  it  is  not  necessary  for  all  the  genes  and  genomic 
regions  within  a  signature  set  to  change  in  the  same  direction  as  set  forth  in  Table  6  in  order  to 
have  a  reasonably  reliable  prediction  of  the  risk.  That  is,  an  increased  risk  can  be  predicted 
based  on  an  increase  in  the  copy  number  per  cell  of  the  sample  for  one  or  more,  preferably  a 
plurality of, the SLCO5A1 genomic region, the KCNB2 genomic region, the KCNH4
genomic region, the JPH1 genomic region, the NCALD genomic region, and the YWHAG
gene, and/or a decrease in the copy number per cell of the sample for one or
more,  preferably  a  plurality  of,  the  PPP3CC  genomic  region,  the  SLC7A5  genomic  region,  the 
SLC7A2  genomic  region,  the  CRISPLD2  genomic  region,  the  CDH13  gene,  the  CDH8  gene, 
the  CDH2  gene,  the  ASAH1  genomic  region,  the  CTD8  gene,  the  MEST  genomic  region,  the 
COL19A1  gene,  the  MAP3K7  genomic  region,  the  NOL4  genomic  region,  or  the  ENOX1 
gene.  By  "plurality"  it  is  meant  at  least  10,  11,  12,  13,  14,  15,  16,  17,  18,  19  or  20  of  the  top 
20  genes  and  gene  regions  listed  in  Table  6. 

[0054]  This  disclosure  also  provides  a  quantitative  measure  of  the  risk  based  on  the  copy 
number alterations of a signature gene set disclosed herein. More specifically, the risk of
metastasis has been found to correlate with a metastatic potential score calculated based on
the  formula: 


$$ M(SM) = \sum_{i} Z_{adjust,i} \cdot Dir_{sig}(i) \cdot Dir_{samp}(i) $$

[0055] That is, for a particular gene or genomic region, if the CNA of the signature and
the sample are in the same direction (both amplified or both deleted), the coefficient will be 1
and the logistic adjusted Z-score (Zadjust) for this gene or genomic region will be added; if they
are in opposing directions, the coefficient will be -1 and the logistic adjusted Z-score (Zadjust)
for the gene or genomic region will be subtracted; and if Dirsamp(i) = 0, the entire term will
not count towards the score. Thus, essentially, the logistic adjusted Z-scores from genes
(i = 1...n) that match the metastasis signature are added, whereas those from genes that mismatch
the signature are subtracted. The logistic adjusted Z-scores (Zadjust) for each of the 368 genes of
the full metastatic signature set are found in Table 6.
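
The summation described in this paragraph can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation used in the examples: the function name is hypothetical, and the Z-score values below are invented placeholders, not values from Table 6. Only the signature directions (deletion for PPP3CC and CDH13, amplification for KCNB2) follow the text.

```python
from typing import Dict


def metastatic_potential_score(
    z_adjust: Dict[str, float],
    dir_sig: Dict[str, int],
    dir_samp: Dict[str, int],
) -> float:
    """Sum Zadjust * Dir_sig * Dir_samp over all signature genes.

    dir_sig holds -1 (deletion) or 1 (amplification) for each signature gene;
    dir_samp holds -1, 0, or 1 for the sample. Genes missing from dir_samp
    are treated as having no detected alteration (0).
    """
    score = 0.0
    for gene, z in z_adjust.items():
        score += z * dir_sig[gene] * dir_samp.get(gene, 0)
    return score


# Placeholder Z-scores (NOT from Table 6), chosen to keep the arithmetic simple.
z_adjust = {"PPP3CC": 0.5, "KCNB2": 0.25, "CDH13": 0.125}
dir_sig = {"PPP3CC": -1, "KCNB2": 1, "CDH13": -1}
# Sample: PPP3CC deleted (match), KCNB2 deleted (mismatch), CDH13 unchanged.
dir_samp = {"PPP3CC": -1, "KCNB2": -1, "CDH13": 0}

score = metastatic_potential_score(z_adjust, dir_sig, dir_samp)
```

Here the matching PPP3CC deletion adds 0.5, the mismatching KCNB2 deletion subtracts 0.25, and the unaltered CDH13 contributes nothing, giving a score of 0.25.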

[0056]  The  calculated  metastatic  potential  score  is  compared  to  a  reference  distribution  of 
samples  (the  metastatic  potential  score  determined  from  a  population  of  men  with  prostate 
cancer  with  metastasis-free  survival  clinical  outcome  information).  Such  reference 
distributions  can  be  predetermined  or  calculated  side-by-side  in  the  same  experiment  as  the 
sample  being  investigated.  Therefore,  an  increase  in  the  metastatic  potential  score  as 
compared  to  the  reference  distributions  is  correlated  with  an  increased  risk  of  metastasis  of 
prostate  cancer.  According  to  this  disclosure,  a  one-point  increase  in  the  metastatic  potential 
score  corresponds  to  an  odds  ratio  of  6.3  for  progression  to  metastasis  (p  =  0.01). 
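
As a worked illustration of the stated odds ratio, and assuming (an interpretation not stated explicitly in the source) that the log-odds of progression are linear in the score, as in a logistic regression, the odds scale with a score difference Δ as follows:

```latex
\frac{\mathrm{odds}(M + \Delta)}{\mathrm{odds}(M)} = 6.3^{\Delta} = e^{\beta \Delta},
\qquad \beta = \ln 6.3 \approx 1.84
% Examples: \Delta = 2 gives 6.3^{2} \approx 39.7; \Delta = 0.5 gives \sqrt{6.3} \approx 2.51
```

Under this assumption, a two-point difference in score corresponds to roughly a 40-fold difference in the odds of progression to metastasis.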

[0057]  The  disclosed  method  for  predicting  the  likelihood  of  distant  metastases  represents 
a  significant  advancement  in  the  diagnosis  and  treatment  of  prostate  cancer.  This  predictor 
may  be  important  for  correctly  categorizing  men  at  the  time  of  diagnosis  and  can  lead  to  a 
choice  of  therapy  that  would  maximize  their  chances  of  survival  and  minimize  adverse  side 
effects  if  aggressive  treatment  can  be  avoided.  Thus,  both  treatment  outcomes  and  quality  of 
life could be improved. In addition, because the proposed tool, tumor genomic analysis, is
comprehensive for identifying the genetic changes that are associated with pathogenesis and
metastases,  there  is  a  greater  likelihood  of  selecting  a  sufficient  number  of  markers  that  are 
both  sensitive  and  specific  predictors.  Furthermore,  because  these  genomic  alterations  are 
themselves  susceptible  to  manipulation  with  drugs,  radiation  or  other  therapies,  they  could 
provide  a  basis  for  assessing  intermediate  endpoints,  such  as  androgen  sensitivity  and 
response  to  radiation.  Ultimately,  copy  number  alterations  could  guide  the  development  of 
individually  tailored  therapies,  including  for  cancers  other  than  prostate. 

[0058]  Diagnostic  Kits 

[0059]  Further  disclosed  herein  are  diagnostic  kits  for  performing  the  methods  described 
herein.  The  kits  can  include  any  and  all  reagents  such  as  nucleic  acid  probes  that  bind  to  one 
or  more  metastatic  signature  genes  described  above,  and  other  assay  reagents.  The  nucleic 
acid  probes  can  be  provided  on  a  solid  support  such  as  a  microarray  slide.  The  kits  can  also 
include  other  materials  such  as  instructions  or  protocols  for  performing  the  method,  which  can 
be  provided  in  an  electronic  version,  e.g.,  on  a  compact  disk  or  the  like. 

EXAMPLES 

[0060]  The  present  description  is  further  illustrated  by  the  following  examples,  which 
should  not  be  construed  as  limiting  in  any  way.  The  contents  of  all  cited  references  (including 
literature  references,  issued  patents,  and  published  patent  applications  as  cited  throughout  this 
application)  are  hereby  expressly  incorporated  by  reference. 

[0061]  Example  1. 

[0062]  This  Example  describes  the  methods  and  sample  sources  utilized  for  developing  a 
predictive  metastasis  model. 

[0063]  PREDICTIVE  BIOMARKERS 

[0064]  The  method  chosen  for  developing  the  predictive  metastasis  model  was  the 
analysis  of  copy  number  alterations  (CNAs)  in  prostate  cancers.  These  cancers  have  been 
known to harbor multiple genomic imbalances that result from CNAs (Beroukhim et al.,
Nature 463(7283):899-905 (2010); Sun et al., Prostate 67(7):692-700 (2007)). High-resolution
measurements of CNAs have informative value, in some cases providing direct
evidence  for  alterations  in  the  quantity  of  normal,  mutant  or  hybrid-fusion  transcripts  and 
proteins  in  the  cancer  cells.  The  resulting  RNA  transcripts  and  proteins  may  impact  the 
fitness  of  the  cell  and  provide  the  mechanisms  necessary  for  travel,  invasion  and  growth. 
From  the  multiple  CNAs  identified  in  tumors,  CNA-based  gene  signatures  were  developed  to 
predict  the  likelihood  of  a  primary  tumor  progressing  to  metastasis. 

[0065]  SAMPLES,  COHORTS  AND  DATA 

[0066] Four publicly available prostate cancer cohorts and a fifth cohort reported here
(GSE27105)  were  studied,  as  summarized  in  Table  1:  1)  294  primary  tumors  and  matched 
normal  tissue  samples  from  NYU  School  of  Medicine  (NYU  n=29),  Baylor  College  of 
Medicine (Baylor n=20) (Castro et al., Neoplasia 11(3):305-12 (2009)), Memorial
Sloan-Kettering Cancer Center (MSK n=181) (Taylor et al., Cancer Cell 18(1):11-22 (2010)), and
Stanford  University  (SU  n=64  (single  normal  tissue  used  to  reference  each  tumor))  (LaPointe 
et  al.,  Cancer  Res  67(18):8504-10  (2007));  2)  49  metastatic  tumors  and  matched  normal 
samples  from  Johns  Hopkins  School  of  Medicine  (Hopkins  n=13)  (Liu  et  al.,  Nat  Med 
15(5):559-65  (2009))  and  MSK  (n=36)  (Taylor  et  al.,  supra).  Normal  prostate  and  tumor 
tissues  (NYU)  were  obtained  from  the  Cooperative  Prostate  Cancer  Tissue  Resource  (Table 
2). Array data from the four publicly available cohorts (Castro et al., supra; Taylor et al.,
supra;  LaPointe  et  al.,  supra;  Liu  et  al.,  supra)  were  downloaded  from  Gene  Expression 
Omnibus  (Barrett  et  al.,  Nucleic  Acids  Res  39  (Database  issue):D1005-10  (2011)) 
(GSE12702,  GSE14996,  GSE6469,  GSE21035).  A  public  cell  lines  cohort  of  various  tumor 
origins  was  obtained  from  the  ArrayExpress  database  (Parkinson  et  al.,  Nucleic  Acids  Res 
39(Database  issue):D1002-4)  (E-MTAB-38)  to  determine  if  the  gene  signature  and  predictive 
model  developed  herein  could  be  applicable  to  other  cancers. 

[0067]  SAMPLE  PROCESSING  (NYU  COHORT) 

[0068]  Genomic  DNA  (gDNA)  was  extracted  using  a  Gentra  DNA  extraction  kit 
(Qiagen).  Purified  gDNA  was  hydrated  in  reduced  TE  buffer  (10  mM  Tris,  0.1  mM  EDTA, 
pH 8.0). The gDNA concentration was measured using the NanoDrop™ 2000
spectrophotometer at an optical density (OD) wavelength of 260 nm. Protein and organic
contamination were measured at OD 280 nm and 230 nm, respectively. Samples that passed
quality control thresholds were then run on a 1% agarose gel to assess the integrity of the
gDNA. For each sample, 500 ng of gDNA was run on the Affymetrix Human SNP Array 6.0 at the
Rockefeller University Genomics Resource Center using standard operating procedures.
Signal intensity data (.cel files) were processed using the Birdseed v2.0 software (Korn et al.,
Nat Genet 40(10):1253-60 (2008)).

[0069]  STUDY  DESIGN 

[0070]  The  case  samples  in  this  study  were  either  metastatic  tumors  (METS)  or  primary 
tumors  from  men  treated  with  radical  prostatectomy  that  later  progressed  to  form  distant 
metastasis (mPTs). METS and mPTs are clearly discernible phenotypes that can be reliably
classified  as  cases.  The  control  samples  were  defined  as  primary  tumors  that  had  not 
progressed  to  form  distant  metastases  following  radical  prostatectomy.  Given  that  radical 
prostatectomy  cures  both  indolent  primary  tumors  (iPTs)  that  would  not  metastasize  and 
primary  tumors  that  would  otherwise  progress  to  form  metastasis,  if  left  untreated,  the  control 
primary  tumors  would  actually  represent  a  mix  of  iPTs  and  unrealized  mPTs.  Assuming  a 
randomly  sampled  cohort,  it  is  expected  that  approximately  30%  of  the  control  group  of 
primary  tumors  would  be  unrealized  mPTs.  The  methods  developed  herein  required  only  the 
prior  information  of  whether  a  sample  was  derived  from  a  metastasis  and  were  designed  to  be 
robust  to  the  confounder  of  mixed  phenotypes. 

[0071]  METASTASIS  PREDICTION  MODEL  STATISTICS 

[0072]  A  weighted  Z-score  algorithm  was  developed  to  calculate  a  metastatic  potential 
score  (MPS)  as  described  in  Example  2,  with  a  higher  score  indicating  a  greater  likelihood  of 
metastasis.  The  predictive  power  of  the  instant  models  was  evaluated  through  cross- 
validation  testing.  Two  prediction  models  were  trained  using  a  combination  of  four  cohorts. 
The  first  model  was  trained  using  49  primary  tumors  of  unknown  clinical  outcome  from  NYU 
(n=29)  and  Baylor  (n=20)  and  a  metastasis  cohort  from  Hopkins  (n=13).  The  second  model 
was trained using 75% of the MSK cohort of primary tumors of unknown outcome (n=126)
along with a set of metastatic tumors (n=36). The gene signatures and MPSs derived
from these 2 models were combined to fit a logistic regression model, which was used to predict bona
fide mPTs (primary tumors that later developed into distant metastasis) and a random sample
of 25% of the control tumors from the MSK cohort not used to train either model. Prediction accuracy
was  measured  by  the  area  under  the  receiver  operating  characteristic  curve  and  Kaplan-Meier 
metastasis-free  survival. 

[0073]  Example  2. 

[0074]  This  Example  describes  the  analytical  pipeline  for  developing  a  metastatic 
potential  clinical  risk  model. 

[0075] An analytical pipeline comprising four main steps was developed using the R
statistical software:

[0076]  In  step  1,  copy  number  amplification  and  deletion  events  for  each  tumor  genome 
were called. Each tumor genome's signal intensity profile was referenced against its
matched normal genome intensity profile (by subtraction), resulting in a copy number profile for each tumor.
Each  sample's  copy  number  profile  was  represented  numerically  as  -1,  0  or  1  (deletion,  no 
event,  or  amplification)  for  each  genomic  position  assayed  by  the  array.  A  summary 
metastasis  profile  (indexing  high  frequency  events)  was  also  created  where  -1  and  1  represent 
deletions  and  amplifications,  respectively,  observed  in  greater  than  25%  of  the  metastasis 
cohort. 
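
The encoding in step 1 can be illustrated with a minimal Python sketch. It is not the pipeline used herein: the function names, the 0.3 difference threshold, and the tie rule (keeping the more frequent event when both exceed the cutoff) are assumptions made only for illustration. Only the -1/0/1 encoding and the 25% frequency cutoff come from the text.

```python
from typing import List


def call_events(tumor: List[float], normal: List[float],
                threshold: float = 0.3) -> List[int]:
    """Encode each position as -1 (deletion), 0 (no event) or 1 (amplification).

    The threshold on the tumor-minus-normal difference is a hypothetical value.
    """
    profile = []
    for t, n in zip(tumor, normal):
        diff = t - n
        if diff > threshold:
            profile.append(1)
        elif diff < -threshold:
            profile.append(-1)
        else:
            profile.append(0)
    return profile


def summary_profile(profiles: List[List[int]],
                    min_fraction: float = 0.25) -> List[int]:
    """Mark positions amplified (1) or deleted (-1) in more than min_fraction of samples."""
    n = len(profiles)
    summary = []
    for calls in zip(*profiles):
        amp = sum(1 for c in calls if c == 1) / n
        dele = sum(1 for c in calls if c == -1) / n
        if amp > min_fraction and amp >= dele:
            summary.append(1)  # assumed tie rule: keep the more frequent event
        elif dele > min_fraction:
            summary.append(-1)
        else:
            summary.append(0)
    return summary
```

For example, with four metastasis profiles, a position amplified in exactly one of them (25%) is not indexed, because the cutoff is strictly greater than 25%.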

[0077]  In  step  2,  a  bootstrap  clustering  method  was  employed  to  develop  an  initial 
grouping  for  the  unknown  primary  tumors.  The  summary  copy  number  profile  for  the 
metastasis  samples  was  combined  with  the  individual  profiles  from  the  unknown  primary 
tumors  and  processed  using  hierarchical  clustering  (binary  distance  metric  and  complete 
clustering method). For each bootstrap iteration, a subset of primary tumors was sampled
with replacement, and each sampled tumor was scored 1 if it was in the same cluster as the
metastasis profile and 0 if it was in the other cluster. Using the results from 20,000 iterations
of the clustering, a similarity index was generated for each sample, representing the fraction
of times it fell in a cluster with the metastasis profile. A sample with a high score was
considered to be more metastatic (mPT), while lower scoring tumors were considered more
indolent (iPT). The similarity scores were distributed throughout the possible range of values
(0 to 1), allowing the formation of distinct groups of tumors with significant contrast between
high and low similarity to the metastasis profile.
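
The bootstrap clustering of step 2 can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the R code used herein: the function names, the default of 200 iterations (the text uses 20,000), and the choice to treat any non-zero call as "on" in the binary distance are assumptions. Scores are reported as the fraction of draws in which a tumor clustered with the metastasis profile.

```python
import random
from typing import Dict, List, Sequence


def binary_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Share of positions where only one profile has an event, among positions where either has one."""
    either = sum(1 for x, y in zip(a, b) if x != 0 or y != 0)
    if either == 0:
        return 0.0
    only_one = sum(1 for x, y in zip(a, b) if (x != 0) != (y != 0))
    return only_one / either


def two_clusters(profiles: List[Sequence[int]]) -> List[int]:
    """Complete-linkage agglomerative clustering, merged until two clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > 2:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(binary_distance(profiles[p], profiles[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    labels = [0] * len(profiles)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


def similarity_index(primaries: Dict[str, List[int]], mets_summary: List[int],
                     iterations: int = 200, seed: int = 0) -> Dict[str, float]:
    """Fraction of bootstrap draws in which each tumor clusters with the metastasis profile."""
    rng = random.Random(seed)
    names = list(primaries)
    hits = {n: 0 for n in names}
    draws = {n: 0 for n in names}
    for _ in range(iterations):
        sample = [rng.choice(names) for _ in names]  # sampling with replacement
        labels = two_clusters([mets_summary] + [primaries[n] for n in sample])
        for name, label in zip(sample, labels[1:]):
            draws[name] += 1
            hits[name] += int(label == labels[0])
    return {n: hits[n] / draws[n] for n in names if draws[n] > 0}
```

In this sketch, a tumor whose events coincide with the metastasis summary profile scores near 1, while a tumor whose events fall at entirely different positions scores 0.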

[0078]  In  step  3,  these  mPT  and  iPT  contrast  groups  were  used  to  assess  quantitative  copy 
number  differences  on  a  probe  basis.  For  each  probe  on  the  array,  an  enrichment  score,  E(x), 
was  calculated,  which  represented  the  relative  amount  of  amplifications  versus  deletions, 
observed  in  each  subgroup  (metastasis,  mPT  and  iPT). 


$$E(x) = \frac{\#\,\mathrm{Amp} - \#\,\mathrm{Del}}{\#\,\mathrm{Samples}}$$

[0079]  Next,  the  relative  enrichment  was  modeled  by  contrasting  the  metastasis  and  mPT 
copy  number  alterations  with  those  observed  in  the  iPT  group. 

$$S_M = e^{[E(\mathrm{METS}) + q \cdot E(\mathrm{mPT}) - E(\mathrm{iPT})]}$$

[0080]  The  first  two  enrichment  terms  being  summed  were  designed  to  assign  a  higher 
score  when  the  METS  and  mPT  samples  had  more  amplifications  than  deletions.  Greater 
amplification  enrichment  in  the  METS  and  mPTs  resulted  in  higher  scores.  The  third  term 
was higher when the iPT samples exhibited the opposite effect (enrichment for deletions over
amplifications).  The  middle  term  was  multiplied  by  a  data-driven  coefficient,  q,  representing 
the  average  contribution  of  mPT  on  a  probe  basis.  For  example,  probes  that  were  amplified  in 
all  metastases  and  mPTs,  but  deleted  in  all  iPTs  would  yield  the  highest  possible  score. 
Likewise,  probes  that  were  deleted  in  all  metastasis  and  mPT  samples,  but  amplified  in  all  iPT 
samples,  would  also  reach  this  maximum  possible  score.  The  probe  scores  were  then 
aggregated  by  gene  and  a  Z-score  was  calculated  to  assess  each  gene's  score  compared  to  the 
rest  of  the  genome. 
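
A minimal Python sketch of step 3 follows. It assumes the probe score is the exponential of E(METS) + q·E(mPT) − E(iPT), with E(x) the difference between amplification and deletion counts divided by the number of samples. The function names, the use of the mean to aggregate probes by gene, and the population standard deviation for the Z-score are illustrative assumptions, not details taken from the text.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List


def enrichment(calls: List[int]) -> float:
    """E(x) = (#Amp - #Del) / #Samples for one probe in one subgroup."""
    return (calls.count(1) - calls.count(-1)) / len(calls)


def selection_score(mets: List[int], mpt: List[int], ipt: List[int], q: float) -> float:
    """Probe score exp(E(METS) + q*E(mPT) - E(iPT)); q is the data-driven coefficient."""
    return math.exp(enrichment(mets) + q * enrichment(mpt) - enrichment(ipt))


def gene_z_scores(probe_scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Aggregate probe scores by gene (mean, an assumption) and standardise across genes."""
    aggregated = {gene: mean(scores) for gene, scores in probe_scores.items()}
    values = list(aggregated.values())
    mu = mean(values)
    sd = pstdev(values)  # assumes at least two genes with differing scores
    return {gene: (value - mu) / sd for gene, value in aggregated.items()}
```

For a probe amplified in every METS and mPT sample and deleted in every iPT sample, with a hypothetical q of 0.5, the score is exp(1 + 0.5 + 1) = exp(2.5).
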

[0081] There are multiple Z-scores for each gene (see Table 6), corresponding to the
various cohorts used to generate the 3 signatures. Therefore, each individual will have
3 different MPSs. The final MPS (shown in Table 6) is calculated by combining the 3 MPSs,
one for each signature, using a variation of the rank method described below.

[0082]  The  Z-adjust  transforms  each  gene’s  Z-score  derived  from  the  above  three  steps  to 
fit  a  logistic  distribution  through  the  following  standard  function: 


$$Z_{adjust} = \frac{1}{1 + e^{-Z}}$$


[0083]  The  purpose  of  this  transformation  is  to  minimize  the  effect  of  any  individual 
gene’s  Z-score  on  the  overall  MPS  (makes  the  score  robust  to  outliers). 
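
The damping effect of the transformation can be sketched in Python, assuming the standard logistic function 1 / (1 + e^(-Z)); the function name is illustrative.

```python
import math


def z_adjust(z: float) -> float:
    """Logistic transform 1 / (1 + exp(-z)); the result lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))
```

Under this assumption, a Z-score of 10 and a Z-score of 20 map to nearly the same adjusted value, so a single extreme gene cannot dominate the summed score.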

[0084]  Finally,  in  step  4,  to  predict  whether  a  local  prostate  tumor  had  the  capability  to 
form  distant  metastasis,  a  weighted-Z  scoring  risk  model  was  developed  based  on  a  signature 
of the top set of CNAs overlapping genomic regions as determined by the significance of their
selection model Z-scores. Genes from step 3 with significant Z-scores (Z > 1.7, the cutoff
point) were used. The metastatic prediction risk model score was defined as follows:

$$M(S_M) = \sum_{i} Z_{adjust,i} \cdot Dir_{sig}(i) \cdot Dir_{samp}(i)$$

[0085] For each tumor profile, logistic adjusted Z-scores (Zadjust) from genes (i...n) that
match the metastasis signature were added, whereas those from genes that mismatch the
signature were subtracted. As the direction component of the risk model score (Dir) reflects, if the
CNAs of the signature and the sample are in the same direction, the coefficient will be 1; if
they are in opposing directions, the coefficient will be -1; and if Dirsamp(i) = 0, then the
entire term will not count towards the score. For example, if a gene i that is typically
amplified in metastases and mPTs is also amplified in the unknown profile, that Z-score is
added, whereas if gene i in the profile is deleted, as expected in iPTs, the Z-score is
subtracted. Neutral genes that are neither amplified nor deleted in the unknown profile are
not scored in this model.
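
The scoring rule of step 4 can be sketched in Python as a signed sum. The function name and the gene labels in the usage note are hypothetical; the inputs are assumed to be the logistic adjusted Z-scores and the -1/0/1 direction calls described above.

```python
from typing import Dict


def metastatic_potential_score(z_adjusted: Dict[str, float],
                               dir_sig: Dict[str, int],
                               dir_samp: Dict[str, int]) -> float:
    """Sum of Zadjust * Dirsig * Dirsamp over the signature genes.

    dir_sig and dir_samp hold -1 (deletion), 0 (no event) or 1 (amplification).
    Genes absent from the sample profile are treated as neutral (0).
    """
    score = 0.0
    for gene, weight in z_adjusted.items():
        score += weight * dir_sig[gene] * dir_samp.get(gene, 0)
    return score
```

For three hypothetical genes with adjusted Z-scores 0.9, 0.8 and 0.7, where the sample matches the signature for the first, opposes it for the second, and is neutral for the third, the score is 0.9 - 0.8 + 0 = 0.1.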

[0086]  Example  3 

[0087]  This  Example  describes  the  results  achieved  by  the  predictive  metastasis  model 
developed  as  described  in  Examples  1-2. 

[0088]  METASTATIC  POTENTIAL  SCORE  DISTRIBUTIONS 

[0089]  Significant  differences  in  the  metastatic  potential  score  were  observed  for  the 
metastasis  (p  =  1.03E-18)  and  mPT  (p  =  0.005)  groups,  compared  to  the  control  primary 
tumors  (Figure  1  and  Table  3).  The  metastatic  potential  score  in  the  lymph  node  positive 
primary  tumors  (derived  from  the  MSK  (n  =  9)  and  SU  (n  =  9)  cohorts)  did  not  differ 
significantly  from  the  control  tumor  group  (Pmsk  =  0.23,  Psu  =  0.19,  Pcombined  =  0.08),  which 
reflected  the  marginal  ability  of  this  clinical  parameter  to  predict  distant  metastasis 
(Boorjian et al., Journal of Urology 178(3 Pt 1):864-70; discussion 70-1 (2007)).
Consistent  with  our  assumption  that  the  control  cohorts  contained  a  fraction  of  mPTs,  their 
metastatic potential score overlapped the range of the cases. Furthermore, among control
primary tumors, the absence of biochemical recurrence (as measured by PSA) after 80 months of
follow-up (represented by Xs in Figure 1) was not correlated with the metastatic potential score. To
determine  whether  other  cancer  types  exhibited  a  similar  metastatic  landscape  of  CNAs  to  that 
observed  in  prostate  cancer,  the  metastatic  potential  scores  for  337  cancer  cell  lines  were 
calculated.  An  overall  distribution  that  overlapped  with  low-risk  prostate  primary  tumors  was 
observed  (Figure  1).  However,  22  of  the  337  cell  lines  emerged  above  the  75th  percentile  of 
the  prostate  primary  tumors  and  metastases,  ranked  by  MPS.  These  cell  lines  originated  from 
tumors  of  the  lung  (n=10),  breast  (n=3),  colon  (n=2)  and  melanoma  (n=2).  Other  singletons 
in  this  group  of  22  cell  lines  originated  from  thyroid,  rectum,  pharynx,  pancreas  and  kidney 
(Table  4). 

[0090] CROSS-VALIDATION AND SURVIVAL ANALYSIS


[0091]  A  cross-validation  analysis  predicting  a  subset  of  primary  tumors  (n  =  52)  not  used 
to  train  the  model  (n=  13  mPTs  and  n=  39  control  primary  tumors)  resulted  in  an  accuracy  of 
80.5%  as  measured  by  the  area  under  the  receiver  operating  characteristic  curve  (ROC-AUC) 
(Figure  2,  left  graph).  Considering  that  control  primary  tumors  were  a  mixture  of  treated 
mPTs  and  iPTs,  the  quality  of  fit  was  believed  to  be  an  underestimate.  Applying  the  instant 
prediction  to  a  Kaplan-Meier  analysis  with  the  clinical  endpoint  of  metastasis-free  survival 
(Figure  2,  right  graph)  resulted  in  a  significant  separation  (p=0.014)  of  the  low-risk  half  of 
the  cohort  (based  on  the  metastatic  potential  score)  compared  to  the  high-risk  half.  A  one- 
point  increase  in  the  metastatic  potential  score  corresponded  to  an  odds  ratio  of  6.3  for 
progression  to  metastasis  (p  =  0.01). 
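
The ROC-AUC used as the accuracy measure can be computed directly from two lists of scores. The Python sketch below uses the pairwise (Mann-Whitney) formulation, which is one standard way to obtain the area under the ROC curve; it is not necessarily the routine used herein, and the function name is illustrative.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a randomly chosen case scores higher than a randomly chosen control.

    Ties count as one half, which matches the area under the empirical ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to no discrimination between mPTs and control tumors, and 1.0 to perfect separation.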

[0092]  BIOMARKER  FUNCTIONAL  SIGNIFICANCE 

[0093]  Many  of  the  top  ranking  metastasis  genes  identified  through  the  analysis  have 
molecular  functions  related  to  alteration  of  nuclear  and  extra-cellular  matrix  structure  and 
metabolic  modification  that  enhance  processes  characteristic  of  metastasis,  such  as  motility, 
invasion,  and  escape  from  anoikis.  A  heat  map  of  the  CNA  events  of  signature  genes  for  all 
prostate  tumors  is  suggestive  of  a  path  toward  the  different  high  frequency  amplification 
versus  deletion  events  that  contrast  the  high-risk  and  low-risk  tumors.  The  mid-risk  region 
with  its  relative  paucity  of  genomic  events  may  represent  the  starting  point  of  two  alternative 
pathways  of  subsequent  copy  number  alteration,  one  leading  to  metastasis  and  the  other  to  an 
indolent  state.  The  locking  in  of  these  'anti-metastasis'  events  in  indolent  tumors  may  explain 
why  they  fail  to  metastasize  despite  extended  periods  of  watchful  waiting. 

[0094]  Many  of  the  genes  within  these  amplified  or  deleted  regions  from  which  the 
predictive  signature  was  derived  have  been  shown  previously  to  play  a  role  in  prostate  cancer 
metastasis. One of the top predictor genes, the solute carrier family SLC7A5 gene deleted on
chromosome 16q24.2, encodes a neutral amino acid transporter protein (LAT1) that has been
implicated in multiple cancers (prostate (Sakata et al., Pathol Int 59(1):7-18 (2009)), breast
(Kaira et al., Cancer Science 99(12):2380-6 (2008)), ovarian (Kaji et al., International
Journal of Gynecol Cancer 20(3):329-36 (2010)), lung (Imai et al., Histopathology
54(7):804-13 (2009)) and brain (Kobayashi et al., Neurosurgery 62(2):493-503; discussion -4
(2008))) and has been shown to have utility as a diagnostic (Bartlett et al., Breast Cancer
Research 12(4):R47 (2010); Ring et al., Mod Pathol 22(8):1032-43 (2009); Ring et al.,
Journal of Clinical Oncology 24(19):3039-47 (2006)) and as a drug target in cell line (Fan et al.,
Biochem Pharmacol 80(6):811-8 (2010); Yamauchi et al., Cancer Letter 276(1):95-101
(2009); Kim et al., Biol Pharm Bull 31(6):1096-100 (2008)) and pre-clinical animal models
(Oda et al., Cancer Science 101(1):173-9 (2010)). The normal function of LAT1 is to
regulate cellular amino acid concentrations: L-glutamine (efflux) and L-leucine (influx).
Reduced activity of LAT1 results in increased concentrations of L-glutamine, which has been
shown to constitutively fuel mTOR activity (Nicklin et al., Cell 136(3):521-34 (2009)), and
targeting of glutamine utilization through the use of a glutamine analog dramatically reduced
tumor growth and metastasis in cellular and in vivo mouse models (Shelton et al.,
International Journal of Cancer 127(10):2478-85). Seven other solute carrier superfamily
members (SLCO5A1, SLC7A2, SLC10A5, SLC26A7, SLC25A37, SLC38A8 and
SLC39A14) were predictive of metastatic potential in the models disclosed herein. A ninth
SLC gene, SLC44A1, encoding a choline transporter (Michel et al., FASEB J 23(8):2749-58
(2009)), was identified as part of a 17-gene expression signature, comparing prostate
primary tumors of men treated with radical prostatectomy that metastasized versus those of
men that recurred biochemically, but did not metastasize (Nakagawa et al., supra).

[0095]  A  second  set  of  signature  genes  includes  6  Cadherin  family  members  encoding 
calcium  dependent  cell  adhesion  glycoproteins  (CDH2,  CDH8,  CDH13,  CDH15,  CDH17  and 
PCDH9).  Many  of  the  Cadherin  family  proteins  have  putative  functions  associated  with 
metastasis  progression  (Yilmaz  et  al.,  Mol  Cancer  Res  8(5):629-42,  2010)  and  have  been 
included  in  diagnostic  panels  (Celebiler  et  al.,  Cancer  Sci  100(12):2341-5  (2009);  Lu  et  al., 
PLoS Med 3(12):e467 (2006)). In a recent study, monoclonal antibody treatment targeting
CDH2 inhibited prostate cancer growth and metastasis in androgen independent prostate
cancer xenograft models (Tanaka et al., Nat Med 16:1414-20 (2010)).

[0096] A third set of 6 genes predicted to contribute to metastatic potential encodes
potassium channels: KCNB2, KCNQ3, KCNAB1, KCTD8, KCTD9 and KCNH4. Three other
potassium channel genes reside in the highly amplified region between 8q13 and 8q24 (KCNS2,
KCNV1 and KCNK9); these did not rank high in our analysis, but may have weak or modifier
effects. High levels of cytoplasmic potassium ion concentration are maintained by BCL-2, a
putative oncogene, through the inhibition of potassium channel transcription. These high
levels were shown to inhibit a necessary precursor to the hallmark mitochondrial apoptotic
cascade of membrane disruption and ensuing release of cytochrome C, caspase, and nuclease
degradation of cellular components (Ekhterae et al., American Journal of Physiol Cell Physiol
281(1):C157-65 (2001)). Furthermore, another study has shown that the hypermethylation
status of potassium channel KCNMA1 (10q22.3) has predictive value for prostate
cancer recurrence (Vanaja et al., Cancer Invest 27(5):549-60 (2009)). The activity of
voltage-gated potassium channels in prostate cancer cell lines LNCaP (low metastatic potential)
and PC3 (high metastatic potential) was observed to be markedly different (Laniado et al.,
Prostate 46(4):262-74 (2001)). Mounting evidence also implicates potassium channels in
the migration of breast cancer cells (Zhang et al.,
Sheng Li Xue Bao 61(1):15-20 (2009)).

[0097]  The  complete  set  of  metastasis  signature  genes  used  in  the  prediction  model  (n  = 
368, Table 6) represents various subsets of functions, revealing a unique profile necessary for
each  tumor  to  progress  to  metastasis. 

[0098]  Example  4 

[0099]  RANKING  METASTASIS  GENES  ON  THE  BASIS  OF  PREDICTABILITY 

[00100]  The  metastatic  potential  score  as  derived  from  the  complete  set  of  368  metastasis 
genes resulted in a predictive accuracy of AUC = 81% in the cohort described in Examples 1-3.
To determine the hierarchy of the genes that contribute to this prediction, several
simulations (K) were performed by randomly sampling subsets of genes (n) from the 368 genes,
where n=20, 40, 50, 80, 100. This procedure sought to identify those genes that maximize
the prediction accuracy (AUC = 81%) while also maximizing the regression coefficient
between the MPS from the 368 genes and the MPS from any given randomly
sampled subset of genes. For example, a random subset of 20 genes that achieves a prediction
accuracy of 81% and an r² of 1.0 compared to the MPS derived from the 368 gene signature
would achieve the theoretical best performance (Figure 3).

[00101] Once gene rankings for the 5 simulations were determined, the rank positions of each
gene G across the K analyses were evaluated using a non-parametric ranking method
(Breitling et al., FEBS Lett 573:83-92 (2004)):

$$R(G) = \sum_{k} \log\left(\frac{1}{r_{G,k}}\right)$$

where r_{G,k} denotes the rank of gene G in analysis k.


[00102]  This  method  was  selected  as  an  improvement  to  a  simple  average  of  the  ranks  of 
each  G  across  the  k  analyses  because  it  gives  more  emphasis  to  having  a  high  rank  in  any  one 
of  the  analyses,  regardless  of  rank  in  the  others.  This  model  of  rank  integration  gives  more 
weight,  for  example,  to  a  gene  ranked  #1  and  #100  in  two  different  analyses  than  to  a  gene 
ranked  #100  in  each. 
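
The difference between this rank integration and a simple average can be sketched in Python, assuming the composite score is the sum over analyses of log(1/rank), so that a higher (less negative) value indicates a better composite rank. The function names are illustrative.

```python
import math
from typing import Sequence


def log_rank_score(ranks: Sequence[int]) -> float:
    """Sum of log(1 / rank) over the analyses; higher means a better composite rank."""
    return sum(math.log(1.0 / r) for r in ranks)


def mean_rank(ranks: Sequence[int]) -> float:
    """Simple average of the ranks; lower means a better composite rank."""
    return sum(ranks) / len(ranks)
```

Under this assumption, a gene ranked #1 and #100 scores about -4.6, ahead of a gene ranked #50 in both analyses (about -7.8), even though the simple averages (50.5 versus 50) would order them the other way.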

[00103]  To  evaluate  the  performance  of  this  method,  the  composite  ranked  hierarchy  of 
genes  was  assessed  using  an  extending  window.  Starting  with  a  minimum  of  12  genes,  and 
adding one gene every iteration, an AUC and r² were calculated. The results in Figure 4
show that the AUC plateaus at ~80 genes, achieving the optimal AUC of ~0.81 and r² > 0.95.
Specifically,  Table  5  shows  the  results  for  the  top  12,  20,  40,  80  and  100  genes. 

[00104]  The  ranking  of  the  368  genes  is  shown  in  Table  6. 

[00105]  Example  5. 

[00106]  REPORTING  PREDICTION  TO  PATIENTS 

[00107] The prostate cancer metastatic potential score, assessed through a Cox proportional
hazards model, provides the basis for determining metastasis-free probability. In Figure
5 (left panel, ROC curve), a conservative threshold that maximizes sensitivity (at 100% or
1.0 on the Y-axis of the ROC curve) was chosen to identify all true positives (i.e., men that
will progress to metastasis). Within this high-risk group there is a false positive rate of 59%
(men who would not otherwise have developed metastasis), which will result in some men
with low-risk prostate cancer being treated aggressively. However, currently 100% of the men
are treated aggressively, so the conservative threshold herein would enable 31% of them to be
spared aggressive treatment.
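
The choice of a threshold at 100% sensitivity can be sketched in Python. The function name and the scores in the usage note are hypothetical. The sketch simply takes the lowest score among known cases as the cutoff and reports the share of controls at or above it.

```python
from typing import Sequence, Tuple


def conservative_threshold(case_scores: Sequence[float],
                           control_scores: Sequence[float]) -> Tuple[float, float]:
    """Return the highest cutoff that still flags every case, and the resulting false positive rate."""
    threshold = min(case_scores)  # every case scores at or above this value
    false_positives = sum(1 for s in control_scores if s >= threshold)
    return threshold, false_positives / len(control_scores)
```

With hypothetical case scores of 2.0, 3.5 and 5.0 and control scores of 0.5, 1.0, 2.5 and 4.0, the cutoff is 2.0 and half of the controls are flagged as high risk.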

[00108]  Applying  this  conservative  threshold  to  a  Kaplan-Meier  analysis  of  a  Cox 
proportional  hazards  model  (Figure  5,  right  panel)  results  in  low  risk  and  high  risk 
probabilities of metastasis-free survival at various time intervals. Therefore, for this model, a
man with a low risk designation will have a very low (< 5%) chance of developing metastasis in
10 years, whereas the high risk designation results in a 40% chance of progressing to metastasis
in 60 months and a > 90% chance of progressing to metastasis in 10 years.

[00109]  As  a  comparison,  the  FDA  approved  breast  cancer  gene  expression  signature 
diagnostic “MammaPrint” uses a similar Cox proportional hazards analysis to develop its risk
reporting strategy (Bogaerts et al., Nat Clin Pract Oncol 3:540-51, 2006). Currently, the FDA
low  risk  assignment  has  a  10%  chance  of  progressing  to  metastatic  disease  in  10  years,  while 
the  high  risk  assignment  has  a  29%  chance  of  progressing  in  10  years. 

WHAT IS CLAIMED IS:


1. A method of determining the risk of metastasis of prostate cancer in a human subject
who  has  or  had  prostate  cancer,  the  method  comprising 

(a)  determining  in  a  prostate  sample  from  the  subject  the  number  of  copies  per  cell 
of  at  least  12  genes  and/or  genomic  regions  of  a  metastatic  gene  signature  set,  wherein  the 
metastatic gene signature set consists of the PPP3CC genomic region, the SLCO5A1 genomic
region,  the  SLC7A5  genomic  region,  the  SLC7A2  genomic  region,  the  CRISPLD2  genomic 
region,  the  CDH13  gene,  the  CDH8  gene,  the  CDH2  gene,  the  ASAH1  genomic  region,  the 
KCNB2  genomic  region,  the  KCNH4  genomic  region,  the  CTD8  gene,  the  JPH1  genomic 
region,  the  MEST  genomic  region,  the  NCALD  genomic  region,  the  COL19A1  gene,  the 
MAP3K7  genomic  region,  the  YWHAG  gene,  the  NOL4  genomic  region,  and  the  ENOX1 
gene, 

(b) determining alterations in the number of copies per cell for each of the at least
12  genes  and/or  genomic  regions  as  compared  to  the  number  of  copies  per  cell  in  non-cancer 
cells,  and 

(c)  determining  the  risk  of  prostate  cancer  metastasis  based  on  the  copy  number 
alterations (CNAs) determined in step (b).

2.  The  method  of  claim  1,  wherein  the  at  least  12  genes  and/or  genomic  regions  include 
the PPP3CC genomic region, the SLCO5A1 genomic region, the SLC7A5 genomic region,
the  SLC7A2  genomic  region,  the  CRISPLD2  genomic  region,  the  CDH13  gene,  the  CDH8 
gene,  the  CDH2  gene,  the  ASAH1  genomic  region,  the  KCNB2  genomic  region,  the  KCNH4 
genomic  region,  and  the  CTD8  gene. 

3.  The  method  of  claim  1,  wherein  the  at  least  12  genes  and/or  genomic  regions  include 
all  of  the  genes  and  genomic  regions  in  said  metastatic  gene  signature  set. 

4.  The  method  of  claim  1  or  3,  wherein  an  increase  in  the  copy  number  per  cell  for  any  of 
the SLCO5A1 genomic region, the KCNB2 genomic region, the KCNH4 genomic region, the
JPH1  genomic  region,  the  NCALD  genomic  region,  or  the  YWHAG  gene,  correlates  with  an 
increased  risk  of  prostate  cancer  metastasis;  and  a  decrease  in  the  copy  number  per  cell  for 
any of the PPP3CC genomic region, the SLC7A5 genomic region, the SLC7A2 genomic
region,  the  CRISPLD2  genomic  region,  the  CDH13  gene,  the  CDH8  gene,  the  CDH2  gene, 
the  ASAH1  genomic  region,  the  CTD8  gene,  the  MEST  genomic  region,  the  COL19A1  gene, 
the  MAP3K7  genomic  region,  the  NOL4  genomic  region,  or  the  ENOX1  gene,  correlates  with 
an  increased  risk  of  prostate  cancer  metastasis. 

5.  The  method  of  claim  1,  wherein  the  PPP3CC  genomic  region  comprises  the  genes 
PPP3CC,  KIAA1967,  BIN3,  SORBS3,  PDLIM2,  RHOBTB2,  SLC39A14,  EGR3,  and 
C8orf58. 

6. The method of claim 1, wherein the SLCO5A1 genomic region comprises the genes
SLCO5A1, SULF1, NCOA2, CPA6, C8orf34, PRDM14, and PREX2.

7.  The  method  of  claim  1,  wherein  the  SLC7A5  genomic  region  comprises  the  genes 
SLC7A5,  CA5A,  BANP,  KLHDC4,  CYBA,  JPH3,  ZFPM1,  SNAI3,  ZC3H18,  MVD,  IL17C, 
C16orf85,  and  RNF166. 

8.  The  method  of  claim  1,  wherein  the  SLC7A2  genomic  region  comprises  the  genes 
SLC7A2, MTMR7 and MTUS1.

9.  The  method  of  claim  1,  wherein  the  CRISPLD2  genomic  region  comprises  the  genes 
CRISPLD2,  ZDHHC7,  KIAA0513,  KLHL36,  and  USP10. 

10.  The  method  of  claim  1,  wherein  the  ASAH1  genomic  region  comprises  the  genes 
ASAH1  and  PCM1. 

11.  The  method  of  claim  1,  wherein  the  KCNB2  genomic  region  comprises  the  genes 
KCNB2,  EYA1,  XKR9,  and  TRPA1. 

12.  The  method  of  claim  1,  wherein  the  KCNH4  genomic  region  comprises  the  genes 
KCNH4,  RAB5C,  DHX58,  KAT2A,  and  HSPB9. 

13.  The  method  of  claim  1,  wherein  the  JPH1  genomic  region  comprises  the  genes  JPH1, 
HNF4G,  CRISPLD1,  PI15,  and  GDAP1. 

14. The method of claim 1, wherein the MEST genomic region comprises the genes
MEST,  COPG2,  CPA5,  CPA2,  CPA1,  CPA4,  and  TSGA14. 

15.  The  method  of  claim  1,  wherein  the  NCALD  genomic  region  comprises  the  genes 
NCALD,  ZNF706,  GRHL2,  and  YWHAZ. 

16.  The  method  of  claim  1,  wherein  the  MAP3K7  genomic  region  comprises  the  genes 
MAP3K7  and  EPHA7. 

17.  The  method  of  claim  1,  wherein  the  NOL4  genomic  region  comprises  the  genes  NOL4 
and  DTNA. 

18.  The  method  of  claim  3,  further  comprising  determining  the  number  of  copies  per  cell 
of  at  least  one  additional  gene  or  genomic  region  listed  in  Table  6. 

19.  The  method  of  claim  18,  wherein  said  at  least  one  additional  gene  or  genomic  region 
comprises  20  genes  and/or  genomic  regions  listed  in  Table  6. 

20. The method of claim 19, wherein said 20 genes and/or genomic regions are the genes and
genomic  regions  ranked  21-40  in  Table  6. 

21.  The  method  of  claim  1,  wherein  the  copy  number  of  a  gene  or  genomic  region  is 
determined  using  a  nucleic  acid  probe  that  hybridizes  to  the  gene  or  genomic  region  in  the 
genomic  DNA  present  in  the  sample. 

22.  The  method  of  claim  21,  wherein  hybridization  is  performed  in  an  array  format. 

23.  The  method  of  claim  4,  wherein  the  risk  is  determined  based  on  calculating  a 
metastatic  potential  score: 

$$M(S_M) = \sum_{i} Z_{adjust,i} \cdot Dir_{sig}(i) \cdot Dir_{samp}(i)$$

wherein the logistic adjusted Z-scores (Zadjust) for each of the genes of the metastatic
signature set are set forth in Table 6 and wherein if the CNAs of the signature and the sample
are in the same direction, the coefficient (Dir) will be 1; if they are in opposite directions, the
coefficient will be -1; and if no alteration in copy number is detected for a gene, the
coefficient  for  that  gene  =  0;  and  comparing  the  metastatic  potential  score  to  a  control  value, 
wherein  an  increase  in  the  score  correlates  with  an  increased  risk  of  metastasis. 

24.  A  kit  for  determining  the  risk  of  metastasis  of  prostate  cancer  in  a  human  subject  who 
has  or  had  prostate  cancer,  comprising  a  plurality  of  nucleic  acid  probes  which  specifically 
hybridize  to  at  least  12  genes  and/or  genomic  regions  of  a  metastatic  gene  signature  set, 
wherein  the  metastatic  gene  signature  set  consists  of  the  PPP3CC  genomic  region,  the 
SLCO5A1 genomic region, the SLC7A5 genomic region, the SLC7A2 genomic region, the
CRISPLD2  genomic  region,  the  CDH13  gene,  the  CDH8  gene,  the  CDH2  gene,  the  ASAH1 
genomic  region,  the  KCNB2  genomic  region,  the  KCNH4  genomic  region,  the  CTD8  gene, 
the  JPH1  genomic  region,  the  MEST  genomic  region,  the  NCALD  genomic  region,  the 
COL19A1  gene,  the  MAP3K7  genomic  region,  the  YWHAG  gene,  the  NOL4  genomic 
region,  and  the  ENOX1  gene. 

25.  The  kit  of  claim  24,  comprising  a  plurality  of  nucleic  acid  probes  which  specifically 
hybridize  to  all  of  the  20  genes  and/or  genomic  regions  of  said  metastatic  gene  signature  set. 

ABSTRACT


A  method  of  determining  the  risk  of  metastasis  of  prostate  cancer  in  a  human  subject 
who  has  or  had  prostate  cancer  is  disclosed  herein.  The  method  is  based  on  detecting  in  a 
prostate  sample  from  the  subject  the  number  of  copies  per  cell  of  genes  and/or  genomic 
regions  of  a  metastatic  gene  signature  set  disclosed  herein,  and  determining  alternations  in  the 
number  of  copies  per  cell  of  the  genes  and/or  genomic  regions  in  the  signature  set,  as 
compared  to  the  number  of  copies  per  cell  in  non-cancer  cells,  thereby  determining  the  risk  of 
prostate  cancer  metastasis. 
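
As a hedged illustration of the comparison described in the abstract, the sketch below labels a gene as gained, lost, or unaltered from its copies per cell relative to non-cancer cells. The reference of 2 copies per cell and the tolerance of 0.5 copies are assumptions chosen for the example; the document does not specify them here, and the function name is hypothetical.

```python
def call_alteration(
    tumor_copies: float,
    normal_copies: float = 2.0,  # ASSUMPTION: diploid non-cancer reference
    tolerance: float = 0.5,      # ASSUMPTION: illustrative noise threshold
) -> int:
    """Return +1 for a copy number gain, -1 for a loss, 0 for no alteration."""
    difference = tumor_copies - normal_copies
    if difference > tolerance:
        return 1
    if difference < -tolerance:
        return -1
    return 0


# Example calls with made-up copy numbers.
gain_call = call_alteration(3.1)
loss_call = call_alteration(1.0)
neutral_call = call_alteration(2.2)
```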
Table 2| NYU cohort sample information

compositeID  Race  tumor_type  Stage  Gleason (primary)  Gleason (secondary)  Age at prostatectomy (years)
CA_1  CA  primary  T2c  3  3  70
CA_2  CA  primary  T4  3  3  59
AA_3  AA  primary  T2c  3  4  76
CA_4  CA  primary  T2c  3  3  58
CA_5  CA  primary  T4  3  3  73
CA_6  CA  primary  T3a  3  4  67
CA_7  CA  primary  T4  4  3  68
CA_8  CA  primary  T2c  3  3  64
CA_9  CA  primary  T3a  4  4  72
CA_10  CA  primary  T2b  3  3  69
CA_11  CA  primary  T2c  3  4  60
CA_12  CA  primary  T2c  3  3  63
CA_13  CA  primary  T3a  4  4  58
CA_14  CA  primary  T2c  4  4  64
CA_16  CA  primary  T2c  3  3  65
CA_17  CA  primary  T3b  3  4  67
CA_18  CA  primary  T3a  3  3  68
CA_19  CA  primary  T3a  4  3  68
CA_20  CA  primary  T2a  4  4  56
AA_21  AA  primary  T2b  3  3  62
AA_22  AA  primary  T2b  3  5  53
AA_23  AA  primary  T3a  4  5  47
AA_24  AA  primary  T3b  3  4  53
AA_25  AA  primary  T2b  3  4  58
CA_26  CA  primary  T4  3  4  64
AA_27  AA  primary  T2b  3  3  64
AA_28  AA  primary  T2b  3  3  62
AA_29  AA  primary  T2b  3  3  67
CA_30  CA  primary  T2b  3  4  45

Table 3| Prostate tumor metastatic potential score

sampleID  MPS  cohort  subgroup  sampleID  MPS  cohort  subgroup  sampleID  MPS  cohort  subgroup
M_3  1.70  Hopkins  METS  GSM525630  0.57  MSK  PT Control  GSM525747  1.63  MSK  PT Control
M_16  1.18  Hopkins  METS  GSM525631  1.27  MSK  PT Control  GSM525748  1.06  MSK  PT Control
M_17  1.14  Hopkins  METS  GSM525632  0.37  MSK  PT Control  GSM525749  0.42  MSK  PT Control
M_19  1.26  Hopkins  METS  GSM525633  1.01  MSK  PT Control  GSM525750  1.24  MSK  PT Control
M_21  1.51  Hopkins  METS  GSM525634  1.32  MSK  mPT  GSM525751  1.53  MSK  PT Control
M_22  1.54  Hopkins  METS  GSM525635  0.63  MSK  PT Control  GSM525752  0.93  MSK  PT Control
M_24  1.26  Hopkins  METS  GSM525636  0.31  MSK  PT Control  GSM525753  1.73  MSK  ln+
M_28  1.92  Hopkins  METS  GSM525637  1.39  MSK  PT Control  GSM525754  1.66  MSK  mPT
M_30  1.18  Hopkins  METS  GSM525638  1.18  MSK  PT Control  GSM525755  2.06  MSK  PT Control
M_31  1.85  Hopkins  METS  GSM525639  0.54  MSK  PT Control  GSM525756  1.77  MSK  METS
M_32  1.85  Hopkins  METS  GSM525640  1.08  MSK  PT Control  GSM525757  1.11  MSK  METS
M_33  1.29  Hopkins  METS  GSM525641  0.28  MSK  PT Control  GSM525758  1.43  MSK  METS
M_34  1.49  Hopkins  METS  GSM525642  1.35  MSK  PT Control  GSM525759  1.56  MSK  METS
CA_1  0.80  NYU  PT Control  GSM525643  1.14  MSK  PT Control  GSM525760  1.95  MSK  METS
CA_2  0.48  NYU  PT Control  GSM525644  1.08  MSK  PT Control  GSM525761  1.03  MSK  METS
AA_3  1.61  NYU  PT Control  GSM525645  0.58  MSK  PT Control  GSM525762  1.61  MSK  METS
CA_4  1.64  NYU  PT Control  GSM525646  0.95  MSK  PT Control  GSM525763  2.08  MSK  METS
CA_5  0.65  NYU  PT Control  GSM525647  1.55  MSK  mPT  GSM525764  2.05  MSK  METS
CA_6  1.38  NYU  PT Control  GSM525648  1.00  MSK  PT Control  GSM525765  2.00  MSK  METS
CA_7  1.39  NYU  PT Control  GSM525649  0.70  MSK  PT Control  GSM525766  1.29  MSK  METS
CA_8  0.72  NYU  PT Control  GSM525650  0.96  MSK  PT Control  GSM525767  2.04  MSK  METS
CA_9  1.49  NYU  PT Control  GSM525651  0.98  MSK  PT Control  GSM525768  2.01  MSK  METS
CA_10  0.77  NYU  PT Control  GSM525652  0.89  MSK  PT Control  GSM525769  2.02  MSK  METS
CA_11  0.75  NYU  PT Control  GSM525653  0.66  MSK  ln+  GSM525770  1.83  MSK  METS
CA_12  0.90  NYU  PT Control  GSM525654  1.62  MSK  PT Control  GSM525771  2.15  MSK  METS
CA_13  0.58  NYU  PT Control  GSM525655  1.03  MSK  PT Control  GSM525772  1.57  MSK  METS
CA_14  0.89  NYU  PT Control  GSM525656  0.27  MSK  PT Control  GSM525773  1.44  MSK  METS
CA_16  0.67  NYU  PT Control  GSM525657  0.80  MSK  ln+  GSM525774  1.63  MSK  METS
CA_17  0.86  NYU  PT Control  GSM525658  0.67  MSK  PT Control  GSM525775  1.16  MSK  METS
CA_18  0.76  NYU  PT Control  GSM525659  0.92  MSK  PT Control  GSM525776  1.80  MSK  METS
CA_19  1.10  NYU  PT Control  GSM525660  0.53  MSK  PT Control  GSM525777  1.22  MSK  METS
CA_20  0.19  NYU  PT Control  GSM525661  1.72  MSK  PT Control  GSM525778  1.47  MSK  METS
AA_21  0.25  NYU  PT Control  GSM525662  0.57  MSK  PT Control  GSM525779  1.59  MSK  METS
AA_22  1.53  NYU  PT Control  GSM525663  0.59  MSK  PT Control  GSM525780  1.64  MSK  METS
AA_23  0.29  NYU  PT Control  GSM525664  0.06  MSK  PT Control  GSM525781  1.23  MSK  METS
AA_24  1.30  NYU  PT Control  GSM525665  0.28  MSK  PT Control  GSM525782  1.94  MSK  METS
AA_25  0.91  NYU  PT Control  GSM525666  1.50  MSK  mPT  GSM525783  1.80  MSK  METS
CA_26  0.56  NYU  PT Control  GSM525667  0.46  MSK  PT Control  GSM525784  0.91  MSK  METS
AA_27  0.83  NYU  PT Control  GSM525668  1.38  MSK  PT Control  GSM525785  1.98  MSK  METS
AA_28  0.50  NYU  PT Control  GSM525669  0.81  MSK  PT Control  GSM525786  1.84  MSK  METS
AA_29  0.52  NYU  PT Control  GSM525670  0.55  MSK  PT Control  GSM525787  2.10  MSK  METS
CA_30  1.31  NYU  PT Control  GSM525671  0.57  MSK  PT Control  GSM525788  1.65  MSK  METS
AAN_24  0.90  Baylor  PT Control  GSM525672  0.77  MSK  ln+  GSM525789  1.06  MSK  METS
AAN_25  1.42  Baylor  PT Control  GSM525673  1.81  MSK  PT Control  GSM525790  2.16  MSK  METS
AAN_27  0.87  Baylor  PT Control  GSM525674  0.63  MSK  PT Control  GSM525791  1.85  MSK  METS
AAN_31  1.25  Baylor  PT Control  GSM525675  1.01  MSK  PT Control  GSM525792  1.12  MSK  METS
AAN_45  1.30  Baylor  PT Control  GSM525676  0.37  MSK  PT Control  PT130  1.35  SU  PT Control
AAN_52  0.47  Baylor  PT Control  GSM525677  0.90  MSK  PT Control  PL133  0.95  SU  ln+
AAN_58  1.18  Baylor  PT Control  GSM525678  1.61  MSK  PT Control  PT138  0.64  SU  PT Control
AAN_60  0.97  Baylor  PT Control  GSM525679  0.47  MSK  PT Control  PT171  1.16  SU  PT Control
AAN_75  0.60  Baylor  PT Control  GSM525680  0.89  MSK  PT Control  PT173  0.89  SU  PT Control
AAN_110  0.53  Baylor  PT Control  GSM525681  0.89  MSK  PT Control  PT174  1.16  SU  PT Control
AAN_115  0.43  Baylor  PT Control  GSM525682  0.48  MSK  PT Control  PT176  0.72  SU  PT Control
AAN_122  1.05  Baylor  PT Control  GSM525683  0.49  MSK  PT Control  PT177  0.60  SU  PT Control
AAN_128  1.36  Baylor  PT Control  GSM525684  0.32  MSK  PT Control  PT180  0.67  SU  PT Control
AAN_137  0.17  Baylor  PT Control  GSM525685  0.72  MSK  PT Control  PT181  0.93  SU  PT Control
AAN_138  0.87  Baylor  PT Control  GSM525686  1.20  MSK  PT Control  PT308  1.46  SU  PT Control
AAN_140  1.14  Baylor  PT Control  GSM525687  0.66  MSK  PT Control  PT311  0.94  SU  PT Control
AAN_154  1.13  Baylor  PT Control  GSM525688  1.72  MSK  PT Control  PT309  1.10  SU  PT Control
AAN_167  1.31  Baylor  PT Control  GSM525689  0.86  MSK  PT Control  PT312  0.61  SU  PT Control
AAN_80  1.01  Baylor  PT Control  GSM525690  1.22  MSK  PT Control  PT313  0.54  SU  PT Control
AAN_96  1.15  Baylor  PT Control  GSM525691  1.42  MSK  PT Control  PT310  0.66  SU  PT Control
GSM525575  1.67  MSK  PT Control  GSM525692  0.22  MSK  PT Control  PT100  1.07  SU  PT Control
GSM525576  1.83  MSK  PT Control  GSM525693  1.61  MSK  PT Control  PT148  0.39  SU  PT Control
GSM525577  0.87  MSK  ln+  GSM525694  0.67  MSK  PT Control  PT32  0.81  SU  PT Control
GSM525578  1.27  MSK  PT Control  GSM525695  0.42  MSK  PT Control  PT37  0.68  SU  PT Control
GSM525579  1.39  MSK  PT Control  GSM525696  0.37  MSK  PT Control  PT314  1.07  SU  PT Control
GSM525580  1.20  MSK  PT Control  GSM525697  0.90  MSK  PT Control  PT319  1.12  SU  PT Control
GSM525581  0.01  MSK  PT Control  GSM525698  0.25  MSK  PT Control  PT317  1.04  SU  PT Control
GSM525582  0.76  MSK  PT Control  GSM525699  1.20  MSK  PT Control  PT316  1.58  SU  PT Control
GSM525583  0.65  MSK  PT Control  GSM525700  0.57  MSK  PT Control  PT315  1.65  SU  PT Control
GSM525584  1.76  MSK  PT Control  GSM525701  1.43  MSK  PT Control  PT250  1.12  SU  PT Control
GSM525585  0.65  MSK  PT Control  GSM525702  1.16  MSK  PT Control  PT265  0.76  SU  PT Control
GSM525586  0.52  MSK  PT Control  GSM525703  1.10  MSK  PT Control  PT83  0.54  SU  PT Control
GSM525587  0.83  MSK  PT Control  GSM525704  0.94  MSK  mPT  PT87  0.88  SU  PT Control
GSM525588  1.14  MSK  PT Control  GSM525705  1.11  MSK  PT Control  PT318  0.44  SU  PT Control
GSM525589  0.79  MSK  PT Control  GSM525706  0.84  MSK  PT Control  PT96  1.17  SU  PT Control
GSM525590  1.07  MSK  PT Control  GSM525707  1.32  MSK  PT Control  PT102  0.77  SU  PT Control
GSM525591  0.58  MSK  PT Control  GSM525708  0.71  MSK  PT Control  PL114  0.60  SU  ln+
GSM525592  1.36  MSK  PT Control  GSM525709  1.55  MSK  PT Control  PL115  1.11  SU  ln+
GSM525593  1.23  MSK  PT Control  GSM525710  1.47  MSK  PT Control  PL116  1.43  SU  ln+
GSM525594  1.30  MSK  PT Control  GSM525711  1.07  MSK  PT Control  PT215  0.38  SU  PT Control
GSM525595  1.46  MSK  PT Control  GSM525712  1.28  MSK  PT Control  PT205  0.92  SU  PT Control
GSM525596  0.78  MSK  PT Control  GSM525713  0.87  MSK  ln+  PT335  1.17  SU  PT Control
GSM525597  1.34  MSK  PT Control  GSM525714  1.92  MSK  mPT  PT92  0.72  SU  PT Control
GSM525598  0.82  MSK  PT Control  GSM525715  0.45  MSK  PT Control  PT168  1.41  SU  PT Control
GSM525599  0.76  MSK  PT Control  GSM525716  0.32  MSK  PT Control  PT111  0.87  SU  PT Control
GSM525600  1.54  MSK  PT Control  GSM525717  1.11  MSK  PT Control  PT112  0.63  SU  PT Control
GSM525601  0.76  MSK  PT Control  GSM525718  0.70  MSK  PT Control  PT224  0.99  SU  PT Control
GSM525602  1.80  MSK  mPT  GSM525719  0.66  MSK  PT Control  PT229  0.68  SU  PT Control
GSM525603  1.11  MSK  PT Control  GSM525720  0.70  MSK  PT Control  PT233  0.53  SU  PT Control
GSM525604  0.82  MSK  PT Control  GSM525721  0.59  MSK  PT Control  PT19  1.10  SU  PT Control
GSM525605  1.04  MSK  mPT  GSM525722  0.84  MSK  PT Control  PT05  0.70  SU  PT Control
GSM525606  2.09  MSK  mPT  GSM525723  1.66  MSK  PT Control  PT07  0.59  SU  PT Control
GSM525607  1.01  MSK  PT Control  GSM525724  1.46  MSK  PT Control  PT14  0.68  SU  PT Control
GSM525608  0.98  MSK  PT Control  GSM525725  0.91  MSK  PT Control  PT103  0.83  SU  PT Control
GSM525609  1.19  MSK  PT Control  GSM525726  0.59  MSK  PT Control  PT187  0.67  SU  PT Control
GSM525610  1.32  MSK  PT Control  GSM525727  1.03  MSK  ln+  PT190  0.82  SU  PT Control
GSM525611  1.36  MSK  PT Control  GSM525728  1.22  MSK  PT Control  PT191  1.29  SU  PT Control
GSM525612  1.24  MSK  PT Control  GSM525729  1.48  MSK  PT Control  PT195  0.79  SU  PT Control
GSM525613  1.07  MSK  PT Control  GSM525730  1.54  MSK  PT Control  PT126  0.22  SU  PT Control
GSM525614  1.21  MSK  PT Control  GSM525731  1.15  MSK  PT Control  PT235  0.75  SU  PT Control
GSM525615  0.33  MSK  PT Control  GSM525732  1.32  MSK  PT Control  PT28  0.85  SU  PT Control
GSM525616  1.59  MSK  mPT  GSM525733  0.66  MSK  mPT  PT21  0.50  SU  PT Control
GSM525617  1.15  MSK  PT Control  GSM525734  0.66  MSK  PT Control  PL27  1.43  SU  ln+
GSM525618  1.79  MSK  ln+  GSM525735  1.51  MSK  PT Control  PL118  0.47  SU  ln+
GSM525619  1.46  MSK  mPT  GSM525736  1.12  MSK  PT Control  PL122  1.59  SU  ln+
GSM525620  1.16  MSK  PT Control  GSM525737  1.12  MSK  PT Control  PL129  1.17  SU  ln+
GSM525621  0.77  MSK  PT Control  GSM525738  1.06  MSK  PT Control  PL194  1.59  SU  ln+
GSM525622  1.70  MSK  PT Control  GSM525739  1.19  MSK  PT Control  PT41  0.98  SU  PT Control
GSM525623  0.96  MSK  PT Control  GSM525740  0.80  MSK  PT Control
GSM525624  0.39  MSK  PT Control  GSM525741  1.20  MSK  PT Control
GSM525625  1.22  MSK  PT Control  GSM525742  1.35  MSK  PT Control
GSM525626  2.08  MSK  PT Control  GSM525743  0.81  MSK  PT Control
GSM525627  0.71  MSK  PT Control  GSM525744  1.60  MSK  PT Control
GSM525628  0.64  MSK  mPT  GSM525745  1.33  MSK  PT Control
GSM525629  1.54  MSK  ln+  GSM525746  1.26  MSK  PT Control

Table 4| Cell line metastatic potential score

sampleID  MPS  Cat. No.  Origin  sampleID  MPS  Cat. No.  Origin
SS493134  1.78  CCL-121  Lung  SS285150  0.78  HTB-43  Lung
SS493087  1.76  HTB-22  Lung  SS285196  0.78  CCL-218  Liver
SS356931  1.69  CCL-155  Thyroid gland  SS356915  0.78  CRL-2320  Pancreas
SS356919  1.66  HTB-131  Rectum  SS285123  0.77  CRL-1611  Hematopoietic and Lymphatic System
SS364381  1.59  CRL-1420  Pancreas  SS493085  0.77  HTB-44  Lung
SS493086  1.58  HTB-77  Lung  SS247758  0.77  CRL-2321  Lung
SS285144  1.58  CRL-5806  Lung  SS421711  0.77  CCL-113  Hematopoietic and Lymphatic System
SS493106  1.55  CRL-2505  Lung  SS285185  0.77  CRL-2331  Skin
SS493131  1.54  ACC 298  Lung  SS356910  0.76  CRL-2336  Pancreas
SS320522  1.52  HTB-76  Colon  SS493071  0.76  TIB-180  Lung
SS493080  1.52  CRL-2289  Lung  SS351252  0.76  CRL-1620  Colon
SS320536  1.50  CCL-225  Pharynx  SS356921  0.76  CRL-2338  Central Nervous System
SS320523  1.50  HTB-64  Colon  SS351245  0.76  HTB-48  Colon
SS285160  1.49  CRL-1933  Lung  SS364373  0.76  ACC 325  Bladder
SS356911  1.44  HTB-79  Skin  SS247736  0.76  CL-188  Hematopoietic and Lymphatic System
SS285181  1.43  CCL-138  Skin  SS285197  0.76  TIB-190  Hematopoietic and Lymphatic System
SS285143  1.42  HTB-112  Lung  SS421685  0.76  CRL-2061  Hematopoietic and Lymphatic System
SS493083  1.42  CRL-8083  Lung  SS493125  0.75  CCL-227  Lung
SS320542  1.42  CRL-1718  Breast  SS285094  0.75  HTB-103  Central Nervous System
SS285215  1.40  CRL-2064  Breast  SS351239  0.75  CRL-1739  Stomach
SS320532  1.40  HTB-32  Kidney  SS285133  0.75  HTB-16  Hematopoietic and Lymphatic System
SS364371  1.39  CRL-2270  Breast  SS351251  0.74  CCL-228  Brain
SS356924  1.38  CRL-7898  Thyroid gland  SS493135  0.74  CRL-5974  Lung
SS320538  1.37  HTB-31  Breast  SS493094  0.73  CRL-9591  Lung
SS285163  1.37  CCL-119  Prostate  SS351250  0.73  HTB-9  Stomach
SS356942  1.37  CRL-2062  Breast  SS364372  0.73  CCL-251  Bladder
SS320553  1.35  HB-8064  Breast  SS285225  0.73  CRL-2158  Uterus
SS320530  1.35  TIB-161  Colon  SS285131  0.73  CCL-235  Bone
SS285179  1.35  CRL-5868  Bladder  SS351249  0.73  CCL-252  Muscle
SS493073  1.35  CRL-1619  Lung  SS351235  0.72  CRL-2020  Kidney
SS493091  1.34  CRL-10741  Lung  SS285088  0.72  HTB-13  Central Nervous System
SS493075  1.34  CRL-9446  Esophagus  SS493093  0.71  ACC 7  Lung
SS356925  1.33  CRL-2049  Hematopoietic and Lymphatic System  SS285199  0.71  CRL-1473  Bone
SS320539  1.33  HB-8065  Breast  SS285226  0.71  HTB-12  Lung
SS285138  1.31  HTB-1  Lung  SS356914  0.71  CRL-2315  Uterus
SS285072  1.31  CRL-1976  Hematopoietic and Lymphatic System  SS285177  0.71  CRL-1472  Prostate
SS421708  1.30  CRL-5819  Hematopoietic and Lymphatic System  SS364368  0.70  CCL-234  Hematopoietic and Lymphatic System
SS493081  1.30  CRL-1594  Lung  SS285119  0.70  ACC 448  Hematopoietic and Lymphatic System
SS285115  1.30  CRL-2273  Pancreas  SS493103  0.70  CRL-5971  Lung
SS285137  1.29  CRL-7920  Lung  SS285203  0.70  ACC 29  Bladder
SS364370  1.28  CRL-2105  Breast  SS421693  0.70  HTB-19  Hematopoietic and Lymphatic System
SS320541  1.27  HTB-144  Breast  SS493116  0.70  HTB-148  Lung
SS285109  1.27  CRL-2274  Liver  SS364379  0.70  ACC 413  Central Nervous System
SS320537  1.26  HTB-173  Breast  SS285206  0.69  93121056  Vulva
SS285142  1.26  CRL-1595  Lung  SS285176  0.69  HTB-80  Uterus
SS320524  1.26  CCL-224  Colon  SS247725  0.68  CRL-2268  Connective Tissue
SS285161  1.26  CRL-2258  Lung  SS421705  0.68  CRL-1579  Hematopoietic and Lymphatic System
SS285098  1.25  HTB-36  Pharynx  SS285099  0.68  CRL-1441  Lung
SS493097  1.25  CRL-1977  Lung  SS320512  0.67  HTB-82  Lung
SS285102  1.23  CRL-1598  Cervix Uteri  SS351253  0.66  HTB-113  Hematopoietic and Lymphatic System
SS356922  1.23  CRL-2220  Pancreas  SS421702  0.66  ACC 279  Hematopoietic and Lymphatic System
SS285082  1.22  CCL-85  Kidney  SS493070  0.66  ACC 20  Lung
SS364375  1.22  CRL-9607  Hematopoietic and Lymphatic System  SS285065  0.66  HTB-111  Prostate
SS493102  1.21  CRL-10423  Lung  SS285068  0.66  ACC 135  Skin
SS364369  1.21  CRL-1427  Breast  SS320507  0.65  CRL-1682  Lung
SS320544  1.21  CRL-5892  Breast  SS421699  0.65  ACC 198  Hematopoietic and Lymphatic System
SS285172  1.20  CRL-2500  Hematopoietic and Lymphatic System  SS285148  0.64  CRL-2236  Lung
SS285151  1.20  HTB-46  Lung  SS493100  0.64  ACC 360  Lung
SS493137  1.20  CCL-220.1  Lung  SS351241  0.64  ACC 15  Stomach
SS421690  1.19  CRL-1978  Hematopoietic and Lymphatic System  SS356926  0.63  ACC 403  Hematopoietic and Lymphatic System
SS285085  1.18  CRL-1543  Ovary  SS285219  0.63  ACC 365  Skin
SS285194  1.18  CCL-243  Liver  SS421709  0.63  CRL-2265  Hematopoietic and Lymphatic System
SS493088  1.18  CRL-5804  Lung  SS285218  0.63  ACC 215  Lung
SS285186  1.18  CRL-2230  Skin  SS285146  0.62  CRL-5973  Lung
SS493112  1.17  HTB-47  Lung  SS285175  0.62  ACC 131  Hematopoietic and Lymphatic System
SS285080  1.17  HTB-185  Cervix Uteri  SS421687  0.61  ACC 87  Hematopoietic and Lymphatic System
SS285202  1.17  CRL-1440  Cervix Uteri  SS320509  0.61  ACC 277  Hematopoietic and Lymphatic System
SS356916  1.16  CRL-2119  Liver  SS356940  0.61  ACC 231  Hematopoietic and Lymphatic System
SS493096  1.16  CRL-1545  Esophagus  SS285211  0.61  ACC 143  Hematopoietic and Lymphatic System
SS493095  1.16  CRL-8294  Esophagus  SS285217  0.61  ACC 427  Eye
SS493143  1.14  CRL-5915  Colon  SS356941  0.61  CRL-2340  Hematopoietic and Lymphatic System
SS285120  1.13  CRL-2231  Bladder  SS285208  0.60  ACC 361  Synovial Membrane
SS285100  1.11  HTB-55  Lung  SS285159  0.60  ACC 317  Lung
SS421716  1.11  HTB-187  Hematopoietic and Lymphatic System  SS421724  0.60  ACC 48  Colon
SS493079  1.11  CRL-11351  Lung  SS421712  0.59  ACC 414  Hematopoietic and Lymphatic System
SS493089  1.11  CRL-1997  Lung  SS421719  0.59  ACC 382  Hematopoietic and Lymphatic System
SS285141  1.10  CRL-1622  Lung  SS356912  0.59  CRL-1552  Uterus
SS285154  1.10  CRL-1582  Lung  SS356932  0.59  CRL-2625  Hematopoietic and Lymphatic System
SS285192  1.10  CRL-2277  Liver  SS364376  0.58  ACC 548  Kidney
SS247746  1.10  HTB-75  Breast  SS421695  0.58  ACC 128  Hematopoietic and Lymphatic System
SS285092  1.10  CCL-213  Colon  SS285105  0.57  ACC 18  Kidney
SS320540  1.09  CRL-2324  Breast  SS421703  0.57  ACC 47  Hematopoietic and Lymphatic System
SS247731  1.09  CRL-2260  Breast  SS356929  0.56  ACC 399  Central Nervous System
SS351242  1.08  CRL-5922  Bladder  SS493114  0.56  ACC 378  Lung
SS285113  1.08  CRL-5808  Skin  SS285174  0.55  ACC 346  Bone
SS356907  1.08  HTB-175  Bladder  SS421713  0.55  CRL-1484  Hematopoietic and Lymphatic System
SS285209  1.07  HTB-69  Colon  SS285155  0.54  CCL-87  Lung
SS285164  1.06  CRL-1647  Lung  SS364366  0.54  CRL-2392  Hematopoietic and Lymphatic System
SS285170  1.06  CRL-7763  Ovary  SS285183  0.54  CRL-2631  Cervix Uteri
SS285214  1.06  HTB-172  Breast  SS320525  0.54  ACC 526  Colon
SS351246  1.05  CRL-11609  Colon  SS285074  0.54  CCL-248  Hematopoietic and Lymphatic System
SS285118  1.05  CRL-2137  Cervix Uteri  SS421691  0.54  CCL-246  Hematopoietic and Lymphatic System
SS356933  1.05  HTB-94  Bladder  SS285126  0.53  CRL-7779  Uterus
SS356928  1.04  TIB-202  Thyroid gland  SS421706  0.53  ACC 354  Hematopoietic and Lymphatic System
SS285162  1.04  CRL-5985  Rectum  SS421707  0.52  ACC 572  Hematopoietic and Lymphatic System
SS285101  1.04  CRL-11732  Lung  SS285130  0.51  ACC 576  Uterus
SS285190  1.04  CRL-2149  Hematopoietic and Lymphatic System  SS285167  0.51  ACC 546  Lung
SS285087  1.03  CRL-2172  Pancreas  SS285112  0.51  HTB-60  Hematopoietic and Lymphatic System
SS493104  1.03  CRL-1803  Lung  SS351237  0.51  ACC 497  Hematopoietic and Lymphatic System
SS493099  1.03  CRL-5928  Lung  SS285171  0.50  CRL-2630  Hematopoietic and Lymphatic System
SS285205  1.02  HTB-182  Lung  SS285091  0.50  CRL-1432  Brain
SS285090  1.01  HTB-161  Skin  SS421694  0.49  CRL-2740  Hematopoietic and Lymphatic System
SS285066  1.00  HTB-3  Kidney  SS285191  0.49  ACC 197  Uterus
SS493136  1.00  CRL-2142  Lung  SS285124  0.49  ACC 571  Hematopoietic and Lymphatic System
SS285067  1.00  HTB-91  Skin  SS285073  0.49  ACC 577  Hematopoietic and Lymphatic System
SS320511  1.00  TIB-196  Skin  SS493090  0.48  HTB-61  Lung
SS493119  1.00  CRL-5929  Lung  SS421704  0.47  ACC 139  Hematopoietic and Lymphatic System
SS351247  0.99  CRL-5810  Central Nervous System  SS493121  0.46  CRL-8119  Lung
SS493092  0.99  HTB-62  Lung  SS285089  0.45  CRL-2632  Hematopoietic and Lymphatic System
SS421718  0.98  CRL-8033-1  Hematopoietic and Lymphatic System  SS285198  0.45  CRL-2021  Hematopoietic and Lymphatic System
SS285153  0.98  CRL-10302  Brain  SS421710  0.44  CRL-1648  Hematopoietic and Lymphatic System
SS493074  0.98  CRL-5931  Lung  SS285070  0.44  CRL-8119  Muscle
SS493072  0.98  CRL-5811  Lung  SS285077  0.44  CRL-1649  Cervix Uteri
SS285227  0.98  CRL-7724  Uterus  SS285189  0.44  ACC 584  Central Nervous System
SS356906  0.97  HTB-114  Bladder  SS285103  0.43  CCL-214  Hematopoietic and Lymphatic System
SS285193  0.97  CRL-2169  Liver  SS320508  0.43  CRL-5818  Kidney
SS285158  0.97  CRL-1897  Lung  SS493139  0.43  CRL-5920  Bone
SS493113  0.96  CRL-5826  Lung  SS364378  0.43  HTB-58  Hematopoietic and Lymphatic System
SS364374  0.94  HTB-178  Breast  SS285165  0.43  CRL-5906  Lung
SS421700  0.94  CCL-86  Hematopoietic and Lymphatic System  SS356908  0.42  92031919  Stomach
SS285216  0.94  CRL-2195  Bladder  SS351238  0.42  CRL-5883  Colon
SS285079  0.93  CRL-2235  Cervix Uteri  SS285093  0.42  96071721  Colon
SS493108  0.93  HTB-92  Lung  SS285097  0.42  CRL-5896  Hematopoietic and Lymphatic System
SS493123  0.93  CRL-1902  Lung  SS421714  0.42  CRL-5983  Hematopoietic and Lymphatic System
SS285149  0.93  CRL-5800  Lung  SS285210  0.42  CRL-5881  Connective and Soft Tissue
SS285212  0.92  CRL-5833  Thyroid gland  SS356934  0.42  CRL-2578  Brain
SS364367  0.92  CCL-136  Stomach  SS320548  0.41  HTB-56  Hematopoietic and Lymphatic System
SS493082  0.92  HTB-35  Lung  SS320520  0.40  96070808  Hematopoietic and Lymphatic System
SS285145  0.91  CRL-2237  Lung  SS421715  0.40  ACC 351  Hematopoietic and Lymphatic System
SS356909  0.91  HTB-59  Breast  SS285132  0.40  CRL-5879  Hematopoietic and Lymphatic System
SS285204  0.91  CRL-1749  Connective Tissue  SS285114  0.40  CCL-256  Sarcoma
SS364377  0.91  CRL-5813  Hematopoietic and Lymphatic System  SS421701  0.40  CRL-5889  Hematopoietic and Lymphatic System
SS285106  0.91  HTB-166  Breast  SS285095  0.40  CRL-5899  Brain
SS421720  0.90  CRL-2233  Hematopoietic and Lymphatic System  SS285184  0.40  CRL-5893  Vulva
SS285083  0.90  HTB-117  Kidney  SS285117  0.38  CRL-5841  Hematopoietic and Lymphatic System
SS421696  0.90  HTB-169  Hematopoietic and Lymphatic System  SS351244  0.38  HTB-171  Colon
SS285139  0.90  CRL-2128  Lung  SS285147  0.37  CRL-5942  Lung
SS285086  0.89  HTB-183  Ovary  SS285096  0.36  CRL-5844  Hematopoietic and Lymphatic System
SS356927  0.89  HTB-88  Ovary  SS247756  0.36  CRL-5855  Ovary
SS285116  0.89  CRL-2238  Pancreas  SS320533  0.36  CRL-5885  Placenta
SS364365  0.88  HTB-118  Breast  SS285122  0.36  96062201  Placenta
SS285127  0.88  CCL-75  Skin  SS421697  0.35  HTB-174  Hematopoietic and Lymphatic System
SS356918  0.87  HTB-119  Central Nervous System  SS320531  0.35  CRL-5835  Brain
SS285075  0.87  CRL-2261  Prostate  SS285220  0.34  CRL-5888  Eye
SS493078  0.87  HTB-67  Lung  SS421686  0.33  CRL-5831  Hematopoietic and Lymphatic System
SS493107  0.86  CRL-2234  Lung  SS421723  0.33  CRL-5878  Hematopoietic and Lymphatic System
SS285104  0.85  HTB-93  Ovary  SS356935  0.32  CRL-5877  Muscle
SS285200  0.85  CRL-1675  Pancreas  SS320545  0.31  95062830  Hematopoietic and Lymphatic System
SS285071  0.84  CRL-5807  Skin  SS285084  0.31  CRL-5816  Lung
SS285207  0.84  CRL-1671  Vulva  SS320513  0.31  CRL-5853  Colon
SS285129  0.84  CRL-2262  Central Nervous System  SS285078  0.31  CRL-2170  Liver
SS356917  0.84  CCL-237  Central Nervous System  SS493110  0.31  96020724  Lung
SS320514  0.84  HTB-18  Hematopoietic and Lymphatic System  SS421688  0.30  CRL-5914  Hematopoietic and Lymphatic System
SS285173  0.83  CCL-233  Muscle  SS356923  0.30  92031917  Kidney
SS493098  0.83  TIB-153  Lung  SS421721  0.30  CRL-5865  Hematopoietic and Lymphatic System
SS493109  0.83  CRL-2343  Lung  SS285188  0.29  CRL-5895  Brain
SS285187  0.82  CRL-1974  Central Nervous System  SS285182  0.27  CRL-5909  Lung
SS493084  0.81  CRL-2314  Esophagus  SS285107  0.26  HTB-54  Colon
SS285111  0.81  CRL-1621  Hematopoietic and Lymphatic System  SS285121  0.25  CRL-5908  Placenta
SS421722  0.81  CCL-230  Hematopoietic and Lymphatic System  SS351243  0.24  CRL-2066  Colon
SS285169  0.81  CCL-98  Ovary  SS320550  0.23  CRL-5838  Hematopoietic and Lymphatic System
SS421717  0.80  TIB-223  Hematopoietic and Lymphatic System  SS351236  0.20  CRL-2098  Colon
SS285195  0.79  CRL-8644  Liver  SS247755  0.19  CRL-5884  Pancreas
SS421689  0.79  ACC 3  Hematopoietic and Lymphatic System  SS285081  0.17  CRL-5872  Prostate
SS285213  0.79  HTB-53  Hematopoietic and Lymphatic System  SS285128  0.17  CRL-5871  Hematopoietic and Lymphatic System
SS421692  0.79  CCL-244  Hematopoietic and Lymphatic System  SS285125  0.12  92031918  Sarcoma
SS364380  0.79  CCL-238  Prostate  SS285201  0.08  CRL-5911  Connective Tissue
SS493077  0.79  HTB-25  Lung  SS285110  0.00  CRL-5935  Liver
SS285108  0.79  CCL-231  Uterus

Table 5. Model predictions achieved with a range of genes.

Genes  r2  auc
top12  0.69  0.77
top20  0.78  0.81
top40  0.89  0.85
top80  0.94  0.82
top100  0.94  0.82
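
For readers unfamiliar with the auc column, the sketch below shows one common way an area under the ROC curve can be computed from scores: the fraction of (metastatic, control) pairs in which the metastatic sample has the higher score, with ties counted as one half. This is a generic illustration and is not stated to be the procedure used to produce Table 5; the function name is hypothetical, and the six scores are a small subset taken from Table 3 purely as example input.

```python
from typing import Sequence


def pairwise_auc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Example input: MPS values of three METS samples (M_3, M_16, M_17) and
# three PT Control samples (CA_1, CA_2, AA_3) from Table 3.
mets_scores = [1.70, 1.18, 1.14]
control_scores = [0.80, 0.48, 1.61]
example_auc = pairwise_auc(mets_scores, control_scores)
```

With so few samples the result only demonstrates the calculation and should not be compared with the values in Table 5.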

Table 6 (1a)

Final-RANK  gene      index  NYU-Z  NYU-dir  NYU-count  MSKs1-Z  MSKs1-dir  MSKs1-count  MSKs2-Z  MSKs2-dir  MSKs2-count  logrank-n52random  logrank-n271random  logrank-composite  gene-Chr  gene-Cytoband
1           PPP3CC    129    3.1    -1       958        2.6      -1         965          NA       NA         NA           48                 41                  45                 8         p21.3
2           SLCO5A1   167    4.9    1        1000       4.2      1          982          NA       NA         NA           31                 13                  19                 8         q13.3
3           SLC7A5    312    1.7    -1       508        3        -1         980          NA       NA         NA           43                 37                  40                 16        q24.2
4           SLC7A2    110    4.1    -1       1000       NA       NA         NA           NA       NA         NA           44                 43                  44                 8         p22
5           CRISPLD2  299    2.5    -1       735        2.9      -1         939          NA       NA         NA           54                 67                  61                 16        q24.1
6           CDH13     288    8      -1       984        2.9      -1         767          NA       NA         NA           46                 86                  63                 16        q23.3
7           CDH8      265    NA     NA       NA         NA       NA         NA           3.7344   -1         989          15                 10                  11                 16        q21
8           CDH2      349    NA     NA       NA         NA       NA         NA           3.4466   -1         987          16                 15                  17                 18        q12.1
9           ASAH1     114    7.1    -1       1000       NA       NA         NA           NA       NA         NA           105                64                  80                 8         p22
10          KCNB2     175    6.8    1        1000       NA       NA         NA           NA       NA         NA           59                 74                  66                 8         q13.3
11          KCNH4     343    NA     NA       NA         NA       NA         NA           3.7501   1          983          1                  1                   1                  17        q21.2
12          KCTD8     21     NA     NA       NA         NA       NA         NA           2.8192   -1         921          30                 24                  29                 4         p13
13          JPH1      179    6      1        1000       NA       NA         NA           NA       NA         NA           29                 35                  31                 8         q21.11
14          MEST      88     NA     NA       NA         NA       NA         NA           3.2232   1          940          32                 32                  32                 7         q32.2
15          NCALD     207    5.5    1        1000       2.9      1          953          NA       NA         NA           13                 12                  13                 8         q22.3
16          COL19A1   39     NA     NA       NA         NA       NA         NA           3.4333   -1         936          27                 20                  21.5               6         q13
17          MAP3K7    43     NA     NA       NA         NA       NA         NA           3.1873   -1         929          47                 54                  49                 6         q15
18          YWHAG     67     NA     NA       NA         NA       NA         NA           2.7386   1          951          40                 62                  47                 7         q11.23
19          NOL4      350    NA     NA       NA         NA       NA         NA           3.9113   -1         993          4                  2                   2                  18        q12.1
20          ENOX1     247    NA     NA       NA         NA       NA         NA           5.6235   -1         1000         2                  8                   4                  13        q14.11
21          CSMD1     94     NA     NA       NA         NA       NA         NA           4.6280   -1         971          7                  6                   6                  8         p23.2
22          SGCZ      107    4.7    -1       926        NA       NA         NA           3.5107   -1         861          9                  5                   7                  8         p22
23          PDE10A    54     NA     NA       NA         NA       NA         NA           4.6945   -1         999          8                  7                   8                  6         q27
24          PCDH9     252    NA     NA       NA         NA       NA         NA           4.5416   -1         962          5                  19                  9                  13        q21.32
25          HTR2A     250    NA     NA       NA         NA       NA         NA           3.2974   -1         966          10                 11                  10                 13        q14.2
26          HIP1      63     NA     NA       NA         NA       NA         NA           4.4416   1          1000         11                 14                  12                 7         q11.23
27          CD226     354    NA     NA       NA         NA       NA         NA           3.3032   -1         1000         18                 9                   14                 18        q22.2
28          DCC       352    NA     NA       NA         NA       NA         NA           6.6211   -1         1000         12                 17                  15                 18        q21.2
29          CC2D1A    357    NA     NA       NA         NA       NA         NA           3.9705   1          996          17                 18                  18                 19        p13.12
30          PTK2B     152    7      -1       1000       NA       NA         NA           NA       NA         NA           20                 27                  21.5               8         p21.2
31          BCMO1     284    2.9    -1       943        3.6      -1         957          NA       NA         NA           26                 21                  23                 16        q23.2
32          MACROD1   238    NA     NA       NA         1.9      1          533          2.8909   1          973          25                 22                  24                 11        q13.1
33          GRID2     24     NA     NA       NA         NA       NA         NA           5.1103   -1         983          22                 26                  25                 4         q22.1
34          DIAPH3    251    NA     NA       NA         NA       NA         NA           3.2653   -1         982          24                 29                  27                 13        q21.2
35          PILRB     69     NA     NA       NA         NA       NA         NA           2.9352   1          996          28                 25                  28                 7         q22.1
36          MEIS2     259    NA     NA       NA         NA       NA         NA           3.9428   -1         999          19                 39                  30                 15        q14
37          MSRA      98     5.1    -1       999        NA       NA         NA           NA       NA         NA           34                 31                  33                 8         p23.1
38          DPYD      4      NA     NA       NA         NA       NA         NA           2.8861   -1         847          33                 34                  34                 1         p21.3

Table 6 (1b)

Final-RANK  gene      gene-start  gene-end   genesBtwn  contig  clump-index  dist-prev  dist-next  min-dist-to-RGL  Index0-Proxy1  NYU-Zadjust  MSKs1-Zadjust  MSKs2-Zadjust
1           PPP3CC    22354541    22454580   0          1       26           10616      -7079      -7079            1              0.52         0.29           NA
2           SLCO5A1   70747129    70909762   0          1       33           216812     -11428     -11428           1              1.63         1.16           NA
3           SLC7A5    86421131    86460615   0          1       58           18511      -64075     18511            1              0.00         0.47           NA
4           SLC7A2    17398975    17472357   0          1       21           6086       -83768     6086             1              1.10         NA             NA
5           CRISPLD2  83411113    83500614   0          1       56           64959      -40087     -40087           1              0.25         0.42           NA
6           CDH13     81439761    82387705   1          0       NA           102526     -750123    102526           1              3.67         0.42           NA
7           CDH8      60244866    60628240   82         0       NA           7528258    -3524069   -3524069         1              NA           NA             0.87
8           CDH2      23784934    24011189   19         0       NA           5673873    NA         5673873          1              NA           NA             0.70
9           ASAH1     17958214    17986787   1          1       22           306248     -22652     -22652           1              3.10         NA             NA
10          KCNB2     73642524    74012880   1          1       34           352193     -492151    352193           1              2.91         NA             NA
11          KCNH4     37562439    37586822   1          1       64           7810       -1891      -1891            1              NA           NA             0.88
12          KCTD8     43870683    44145581   3          0       NA           1800760    -30632257  1800760          1              NA           NA             0.38
13          JPH1      75309493    75396117   0          1       35           29056      -262534    29056            1              2.37         NA             NA
14          MEST      129913282   129933363  0          1       13           41         -45149     41               1              NA           NA             0.58
15          NCALD     102767947   103206311  1          1       40           128437     -16952     -16952           1              2.03         0.42           NA
16          COL19A1   70633169    70978878   20         0       NA           4871884    NA         4871884          1              NA           NA             0.70
17          MAP3K7    91282074    91353628   0          1       5            2654236    -7084576   2654236          1              NA           NA             0.56
18          YWHAG     75794053    75826252   126        0       NA           23787222   -260189    -260189          1              NA           NA             0.34
19          NOL4      29685062    30057513   0          1       67           269766     -5673873   269766           1              NA           NA             0.98
20          ENOX1     42685704    43259044   18         0       NA           2766260    -4175452   2766260          1              NA           NA             2.12
21          CSMD1     2780282     3258996    46         1       14           5420413    -699503    -699503          1              NA           NA             1.44
22          SGCZ      13991744    15140219   0          1       20           301882     -574978    301882           1              1.49         NA             0.74
23          PDE10A    165660766   165995578  NA         1       8            NA         -17665     -17665           1              NA           NA             1.49
24          PCDH9     65774970    66702578   0          1       49           2470149    -6138850   2470149          1              NA           NA             1.38
25          HTR2A     46305514    46368176   44         1       48           12769542   -36146     -36146           1              NA           NA             0.62
26          HIP1      75001345    75206215   5          0       NA           248023     -1543149   248023           1              NA           NA             1.32
27          CD226     65681175    65775140   NA         0       NA           NA         -12833135  -12833135        1              NA           NA             0.63
28          DCC       48121156    49311780   10         0       NA           3157834    -17395350  3157834          1              NA           NA             2.79
29          CC2D1A    13878014    13902691   1          1       68           30662      -105       -105             1              NA           NA             1.02
30          PTK2B     27224916    27372820   0          1       30           376        -165       -165             1              3.04         NA             NA
31          BCMO1     79829797    79882248   0          1       53           23828      -18320     -18320           1              0.42         0.80           NA
32          MACROD1   63522607    63690109   1          0       NA           19764      -81715     19764            1              NA           0.05           0.41
33          GRID2     93444831    94914730   186        0       NA           60460408   -30824069  -30824069        1              NA           NA             1.77
34          DIAPH3    59137718    59636120   2          0       NA           6138850    -12769542  6138850          1              NA           NA             0.60
35          PILRB     99771673    99803388   0          1       11           5616       -111895    5616             1              NA           NA             0.44
36          MEIS2     34970519    35189740   193        0       NA           24742144   NA         24742144         1              NA           NA             1.00
37          MSRA      9949189     10323803   4          1       16           697587     -271923    -271923          1              1.76         NA             NA
38          DPYD      97315890    98159203   19         0       NA           4955408    -79289745  4955408          1              NA           NA             0.41

Table 6 (2a)

Final-RANK  gene      index  NYU-Z  NYU-dir  NYU-count  MSKs1-Z  MSKs1-dir  MSKs1-count  MSKs2-Z  MSKs2-dir  MSKs2-count  logrank-n52random  logrank-n271random  logrank-composite  gene-Chr  gene-Cytoband
39          ANKRD11   329    3      -1       948        3.7      -1         988          NA       NA         NA           37                 33                  35                 16        q24.3
40          NRXN1     6      NA     NA       NA         NA       NA         NA           3.2327   -1         840          39                 38                  38                 2         p16.3
41          ADCY8     225    3.1    1        980        5.4      1          1000         NA       NA         NA           52                 30                  39                 8         q24.22
42          TRDN      49     NA     NA       NA         NA       NA         NA           3.0342   -1         898          38                 44                  41                 6         q22.31
43          STAU2     177    4.6    1        1000       NA       NA         NA           NA       NA         NA           45                 42                  43                 8         q21.11
44          SF1       240    NA     NA       NA         NA       NA         NA           2.4710   1          886          55                 46                  48                 11        q13.1
45          CLIP2     62     NA     NA       NA         NA       NA         NA           3.0945   1          998          57                 47                  50                 7         q11.23
46          CLDN3     58     NA     NA       NA         NA       NA         NA           2.6179   1          984          51                 53                  51                 7         q11.23
47          ZSWIM4    355    NA     NA       NA         NA       NA         NA           2.8120   1          975          60                 51                  57                 19        p13.13
48          GLRB      26     NA     NA       NA         NA       NA         NA           2.6600   -1         963          64                 48                  58                 4         q32.1
49          DCHS2     25     NA     NA       NA         NA       NA         NA           2.7883   -1         954          68                 60                  64                 4         q32.1
50          TRPS1     217    2.9    1        814        2.7      1          751          NA       NA         NA           63                 65                  65                 8         q23.3
51          MDGA2     258    NA     NA       NA         NA       NA         NA           2.8345   -1         823          69                 66                  68                 14        q21.3
52          CNBD1     193    3.8    1        999        3.8      1          940          NA       NA         NA           67                 70                  69                 8         q21.3
53          STAG3     68     NA     NA       NA         NA       NA         NA           2.4187   1          967          78                 68                  71                 7         q22.1
54          GATA4     102    3.2    -1       979        NA       NA         NA           NA       NA         NA           72                 77                  72                 8         p23.1
55          VPS13B    202    3.9    1        999        NA       NA         NA           NA       NA         NA           85                 69                  74                 8         q22.2
56          DOCK5     144    5.4    -1       1000       NA       NA         NA           NA       NA         NA           81                 78                  76                 8         p21.2
57          ZHX2      218    NA     NA       NA         2.6      1          771          NA       NA         NA           82                 80                  78                 8         q24.13
58          ARHGEF5   90     NA     NA       NA         NA       NA         NA           2.7472   1          760          66                 102                 81                 7         q35
59          SDC2      198    3.4    1        991        NA       NA         NA           NA       NA         NA           75                 90                  82                 8         q22.1
60          MYLK      10     NA     NA       NA         2.8      1          842          NA       NA         NA           93                 75                  83                 3         q21.1
61          LPHN3     23     NA     NA       NA         NA       NA         NA           2.4806   -1         794          80                 92                  85                 4         q13.1
62          MOSPD3    78     NA     NA       NA         NA       NA         NA           2.3144   1          904          90                 82                  86                 7         q22.1
63          GYS2      244    NA     NA       NA         NA       NA         NA           2.7616   -1         884          99                 83                  92                 12        p12.1
64          GAS8      336    NA     NA       NA         2.9      -1         999          NA       NA         NA           84                 103                 95                 16        q24.3
65          RAB9A     362    NA     NA       NA         3.7      1          870          NA       NA         NA           98                 97                  97                 23        p22.2
66          POLR3D    127    NA     NA       NA         2.7      -1         955          NA       NA         NA           91                 109                 98                 8         p21.3
67          PSD3      116    7.3    -1       1000       NA       NA         NA           NA       NA         NA           97                 104                 100                8         p22
68          ZFPM2     213    4.2    1        991        6.3      1          996          NA       NA         NA           149                71                  101                8         q23.1
69          ATP6V1C1  209    NA     NA       NA         2.4      1          858          NA       NA         NA           114                93                  102                8         q22.3
70          MEF2C     36     NA     NA       NA         NA       NA         NA           2.2584   -1         839          109                98                  103                5         q14.3
71          PKIA      185    3.3    1        999        NA       NA         NA           NA       NA         NA           115                99                  104                8         q21.12
72          ADAMTS18  276    3.5    -1       902        NA       NA         NA           NA       NA         NA           100                114                 105                16        q23.1
73          STYXL1    65     NA     NA       NA         NA       NA         NA           2.3049   1          863          104                110                 106                7         q11.23
74          EPM2A     51     NA     NA       NA         NA       NA         NA           2.3972   -1         920          113                105                 108                6         q24.3
75          LEPREL1   19     NA     NA       NA         2.6      1          755          NA       NA         NA           106                119                 110                3         q28
76          GABRA2    22     NA     NA       NA         NA       NA         NA           2.2755   -1         876          119                107                 111                4         p12
77          RCOR2     237    NA     NA       NA         NA       NA         NA           1.7131   1          514          108                120                 114                11        q13.1
78          MFHAS1    95     3.3    -1       956        NA       NA         NA           NA       NA         NA           121                108                 115                8         p23.1

Table 6 (2b)

Final-RANK  gene      gene-start  gene-end   genesBtwn  contig  clump-index  dist-prev  dist-next  min-dist-to-RGL  Index0-Proxy1  NYU-Zadjust  MSKs1-Zadjust  MSKs2-Zadjust
39          ANKRD11   87861536    88084470   11         1       61           246990     -72136     -72136           1              0.47         0.85           NA
40          NRXN1     49999148    51113178   155        0       NA           28619013   NA         28619013         1              NA           NA             0.59
41          ADCY8     131861736   132123854  0          1       45           861663     -378337    -378337          1              0.52         1.97           NA
42          TRDN      123579182   123999937  96         0       NA           20654629   -5440605   -5440605         1              NA           NA             0.48
43          STAU2     74495160    74821629   1          0       NA           199555     -119303    -119303          1              1.43         NA             NA
44          SF1       64288654    64302817   1          0       NA           24747      -560058    24747            1              NA           NA             0.23
45          CLIP2     73341739    73458196   15         1       9            1543149    -35065     -35065           1              NA           NA             0.52
46          CLDN3     72821263    72822536   5          0       NA           404089     -49338     -49338           1              NA           NA             0.29
47          ZSWIM4    13767274    13804044   1          0       NA           50124      NA         50124            1              NA           NA             0.38
48          GLRB      158216788   158312299  0          1       3            48887      -2584470   48887            1              NA           NA             0.31
49          DCHS2     155375138   155632318  14         0       NA           2584470    -60460408  2584470          1              NA           NA             0.37
50          TRPS1     116489900   116750429  20         1       43           7112653    -1971482   -1971482         1              0.42         0.33           NA
51          MDGA2     46379045    47213703   NA         NA      NA           NA         NA         NA               1              NA           NA             0.39
52          CNBD1     87947840    88435220   1          1       38           683360     -122823    -122823          1              0.91         0.91           NA
53          STAG3     99613474    99659778   2          0       NA           111895     -23787222  111895           1              NA           NA             0.21
54          GATA4     11599162    11654918   0          1       18           9709       -139646    9709             1              0.57         NA             NA
55          VPS13B    100094670   100958983  1          0       NA           83469      -187596    83469            1              0.98         NA             NA
56          DOCK5     25098204    25326536   2          0       NA           14747      -1478148   14747            1              1.97         NA             NA
57          ZHX2      123863082   124055936  9          0       NA           706280     -7112653   706280           1              NA           0.29           NA
58          ARHGEF5   143683366   143708657  NA         0       NA           NA         -13747479  -13747479        1              NA           NA             0.35
59          SDC2      97575058    97693213   1          1       39           1032370    -159108    -159108          1              0.68         NA             NA
60          MYLK      124811586   125085868  2          0       NA           210407     -8462769   210407           1              NA           0.38           NA
61          LPHN3     62045434    62620762   157        0       NA           30824069   NA         30824069         1              NA           NA             0.24
62          MOSPD3    100047661   100050932  0          1       12           5043       -3929      -3929            1              NA           NA             0.17
63          GYS2      21580390    21649048   NA         NA      NA           NA         NA         NA               1              NA           NA             0.36
64          GAS8      88616509    88638880   NA         0       NA           NA         -21813     -21813           1              NA           0.42           NA
65          RAB9A     13617262    13637681   191        0       NA           35134932   NA         35134932         1              NA           0.85           NA
66          POLR3D    22158564    22164624   1          1       25           116113     -12768     -12768           1              NA           0.33           NA
67          PSD3      18429093    18915476   0          1       23           300007     -126090    -126090          1              3.23         NA             NA
68          ZFPM2     106400323   106885939  2          1       41           1444960    -729979    -729979          1              1.16         2.58           NA
69          ATP6V1C1  104102424   104154461  5          0       NA           427830     -608753    427830           1              NA           0.21           NA
70          MEF2C     88051922    88214780   63         0       NA           26727467   -19278276  -19278276        1              NA           NA             0.15
71          PKIA      79590891    79678040   2          0       NA           1007876    -1648815   1007876          1              0.63         NA             NA
72          ADAMTS18  75873527    76026512   0          1       52           287400     -722891    287400           1              0.74         NA             NA
73          STYXL1    75463592    75515257   0          1       10           72         -1679      72               1              NA           NA             0.17
74          EPM2A     145988141   146098684  2          1       7            291927     -772282    291927           1              NA           NA             0.20
75          LEPREL1   191157213   191321407  NA         1       2            NA         -49278     -49278           1              NA           0.29           NA
76          GABRA2    45946341    46086561   NA         0       NA           NA         -1800760   -1800760         1              NA           NA             0.16
77          RCOR2     63435303    63440892   3          0       NA           81715      NA         81715            1              NA           NA             0.00
78          MFHAS1    8679409     8788541    0          1       15           109315     -5420413   109315           1              0.63         NA             NA
NA 
Final-RANK  gene      index  NYU-Z  NYU-dir  NYU-count  MSKs1-Z  MSKs1-dir  MSKs1-count  MSKs2-Z  MSKs2-dir  MSKs2-count  logrank-n52random  logrank-n271random  logrank-composite  gene-Chr  gene-Cytoband
79          SCARA5    156    3.3    -1       925        NA       NA         NA           NA       NA         NA           130                101                 116                8         p21.1
80          CCDC25    155    4.4    -1       995        NA       NA         NA           NA       NA         NA           132                100                 117                8         p21.1
81          FAM38A    323    NA     NA       NA         2.7      -1         885          NA       NA         NA           110                130                 119                16        q24.3
82          CTSB      104    2.8    -1       941        NA       NA         NA           NA       NA         NA           111                136                 122                8         p23.1
83          PTK2      235    NA     NA       NA         2.3      1          654          NA       NA         NA           107                144                 123                8         q24.3
84          SPIRE2    331    NA     NA       NA         1.7      -1         508          NA       NA         NA           124                128                 124                16        q24.3
85          C13orf23  246    NA     NA       NA         NA       NA         NA           2.2139   -1         748          141                113                 125                13        q13.3
86          BOD1L     20     NA     NA       NA         NA       NA         NA           2.3508   -1         884          129                127                 126                4         p15.33
87          FAM160B2  120    2.5    -1       899        1.8      -1         567          NA       NA         NA           127                133                 129                8         p21.3
88          NUS1      48     NA     NA       NA         NA       NA         NA           2.2269   -1         859          123                139                 130                6         q22.2
89          MTHFSD    309    NA     NA       NA         2.4      -1         824          NA       NA         NA           112                153                 131                16        q24.1
90          UBR5      208    NA     NA       NA         2.2      1          733          NA       NA         NA           122                155                 135.5              8         q22.3
91          GALNS     325    NA     NA       NA         2.3      -1         856          NA       NA         NA           131                147                 137                16        q24.3
92          FSTL5     28     NA     NA       NA         NA       NA         NA           2.2407   -1         641          138                143                 140                4         q32.2
93          SIM1      46     NA     NA       NA         NA       NA         NA           2.1943   -1         833          120                165                 141                6         q16.3
94          TG        231    3.8    1        997        NA       NA         NA           NA       NA         NA           136                149                 144                8         q24.22
95          BFSP2     12     NA     NA       NA         2.4      1          678          NA       NA         NA           139                154                 148                3         q22.1
96          MMP16     194    NA     NA       NA         3.5      1          931          NA       NA         NA           158                138                 149                8         q21.3
97          RIMS2     210    2      1        692        4        1          939          NA       NA         NA           161                141                 150                8         q22.3
98          PDS5B     245    NA     NA       NA         NA       NA         NA           2.0408   -1         661          145                159                 151                13        q13.1
99          CDK7      31     NA     NA       NA         2.7      -1         988          NA       NA         NA           156                148                 153                5         q13.2
100         CNTNAP4   275    3.2    -1       825        NA       NA         NA           NA       NA         NA           196                126                 156                16        q23.1
101         CFDP1     274    3      -1       925        NA       NA         NA           NA       NA         NA           137                187                 157                16        q23.1
102         FBXL4     45     NA     NA       NA         NA       NA         NA           1.7473   -1         537          154                167                 158                6         q16.2
103         RFX1      358    NA     NA       NA         NA       NA         NA           2.1724   1          861          134                201                 163                19        p13.12
104         NALCN     256    NA     NA       NA         NA       NA         NA           2.1846   -1         731          182                152                 165                13        q33.1
105         STX1A     57     NA     NA       NA         NA       NA         NA           2.1787   1          835          177                161                 167                7         q11.23
106         CYP7B1    162    NA     NA       NA         1.7      1          508          NA       NA         NA           147                204                 168                8         q12.3
107         ARHGEF10  92     NA     NA       NA         2.9      -1         923          NA       NA         NA           215                145                 171                8         p23.3
108         ENTPD4    141    2.7    -1       875        NA       NA         NA           NA       NA         NA           230                137                 173                8         p21.3
109         ZNF704    188    NA     NA       NA         2.5      1          815          NA       NA         NA           211                151                 174                8         q21.13
110         C8orf79   105    2.9    -1       937        NA       NA         NA           NA       NA         NA           163                197                 176                8         p22
111         SLC9A9    13     NA     NA       NA         2.7      1          746          NA       NA         NA           170                189                 177                3         q24
112         CHMP7     139    NA     NA       NA         2.4      -1         925          NA       NA         NA           185                176                 178                8         p21.3
113         GPC5      255    NA     NA       NA         NA       NA         NA           2.1374   -1         610          171                193                 180                13        q31.3
114         MYC       222    4.2    1        972        NA       NA         NA           NA       NA         NA           218                157                 184                8         q24.21
115         STIP1     239    NA     NA       NA         NA       NA         NA           1.7766   1          613          164                209                 185                11        q13.1
116         ZBTB20    9      NA     NA       NA         1.8      1          513          NA       NA         NA           187                184                 186                3         q13.31
117         MEN1      241    NA     NA       NA         NA       NA         NA           2.0513   1          737          176                203                 188                11        q13.1
118         SLC26A7   195    NA     NA       NA         2.2      1          747          NA       NA         NA           213                168                 189                8         q21.3

Final-RANK  gene      gene-start  gene-end   genesBtwn  contig  clump-index  dist-prev  dist-next  min-dist-to-RGL  Index0-Proxy1  NYU-Zadjust  MSKs1-Zadjust  MSKs2-Zadjust
79          SCARA5    27783672    27906117   0          1       31           29490      -97583     29490            1              0.63         NA             NA
80          CCDC25    27646756    27686089   2          0       NA           97583      -188353    97583            1              1.29         NA             NA
81          FAM38A    87302916    87330317   0          1       59           67370      -2604      -2604            1              NA           0.33           NA
82          CTSB      11737442    11763055   7          0       NA           1084499    -55179     -55179           1              0.38         NA             NA
83          PTK2      141737683   142080514  NA         0       NA           NA         -5943220   -5943220         1              NA           0.17           NA
84          SPIRE2    88422408    88465228   0          1       62           2292       -11842     2292             1              NA           0.00           NA
85          C13orf23  38482003    38510252   21         0       NA           4175452    -6231846   4175452          1              NA           NA             0.14
86          BOD1L     13179464    13238426   76         0       NA           30632257   NA         30632257         1              NA           NA             0.19
87          FAM160B2  22002660    22017835   0          1       24           2493       -82619     2493             1              0.25         0.02           NA
88          NUS1      118103310   118138577  15         0       NA           5440605    -16667349  5440605          1              NA           NA             0.14
89          MTHFSD    85121284    85157509   5          1       57           1036491    -15714     -15714           1              NA           0.21           NA
90          UBR5      103334748   103493671  3          0       NA           608753     -128437    -128437          1              NA           0.14           NA
91          GALNS     87407644    87450885   0          1       60           122        -4478      122              1              NA           0.17           NA
92          FSTL5     162524501   163304636  NA         0       NA           NA         -4017824   -4017824         1              NA           NA             0.15
93          SIM1      100939606   101019494  0          1       6            43297      -1437036   43297            1              NA           NA             0.13
94          TG        133948387   134216325  0          1       46           -98170     -18153     -18153           1              0.91         NA             NA
95          BFSP2     134601480   134676746  58         0       NA           9790009    -8678754   -8678754         1              NA           0.21           NA
96          MMP16     89118580    89408833   9          0       NA           2921859    -683360    -683360          1              NA           0.74           NA
97          RIMS2     104582291   105333263  1          0       NA           127566     -427830    127566           1              0.07         1.04           NA
98          PDS5B     32058564    32250157   21         0       NA           6231846    NA         6231846          1              NA           NA             0.08
99          CDK7      68566471    68609004   0          1       4            3274       11239      3274             1              NA           0.33           NA
100         CNTNAP4   74868677    75150636   1          0       NA           722891     -843789    722891           1              0.57         NA             NA
101         CFDP1     73885109    74024888   7          1       51           843789     -25657     -25657           1              0.47         NA             NA
102         FBXL4     99428055    99502570   7          0       NA           1437036    -5242062   1437036          1              NA           NA             0.01
103         RFX1      13933353    13978097   NA         0       NA           NA         -30662     -30662           1              NA           NA             0.13
104         NALCN     100504131   100866814  42         0       NA           12420243   -8187438   -8187438         1              NA           NA             0.13
105         STX1A     72751472    72771925   1          0       NA           49338      NA         49338            1              NA           NA             0.13
106         CYP7B1    65671246    65873902   21         0       NA           2623061    -5476925   2623061          1              NA           0.00           NA
107         ARHGEF10  1759549     1894206    1          0       NA           86359      -115501    86359            1              NA           0.42           NA
108         ENTPD4    23299386    23371081   0          1       28           71227      18281      18281            1              0.33         NA             NA
109         ZNF704    81713324    81949571   0          1       37           93034      -870671    93034            1              NA           0.25           NA
110         C8orf79   12847554    12931653   0          1       19           53590      -1084499   53590            1              0.42         NA             NA
111         SLC9A9    144466755   145049979  50         0       NA           12271116   -9790009   -9790009         1              NA           0.33           NA
112         CHMP7     23157095    23175450   1          1       27           34647      -18511     -18511           1              NA           0.21           NA
113         GPC5      90848919    92316693   29         0       NA           8187438    -19509588  8187438          1              NA           NA             0.11
114         MYC       128816862   128822853  0          1       44           206193     -318241    206193           1              1.16         NA             NA
115         STIP1     63709873    63728596   20         0       NA           560058     -19764     -19764           1              NA           NA             0.01
116         ZBTB20    115540230   116348817  51         0       NA           8462769    -8761797   8462769          1              NA           0.02           NA
117         MEN1      64327564    64335342   0          1       47           12898      -24747     12898            1              NA           NA             0.09
118         SLC26A7   92330692    92479554   5          0       NA           2729012    -2921859   2729012          1              NA           0.14           NA
NA 
Final-RANK  gene      index  NYU-Z  NYU-dir  NYU-count  MSKs1-Z  MSKs1-dir  MSKs1-count  MSKs2-Z  MSKs2-dir  MSKs2-count  logrank-n52random  logrank-n271random  logrank-composite  gene-Chr  gene-Cytoband
119         ALCAM     8      NA     NA       NA         NA       NA         NA           2.4602   1          586          194                186                 191                3         q13.11
120         KIF13B    160    2.7    -1       854        NA       NA         NA           NA       NA         NA           188                194                 192                8         p21.1
121         MBTPS1    291    2.7    -1       906        NA       NA         NA           NA       NA         NA           193                192                 193                16        q24.1
122         PPP2R5B   243    NA     NA       NA         NA       NA         NA           1.8055   1          580          189                202                 196                11        q13.1
123         VPS13C    260    NA     NA       NA         NA       NA         NA           1.7860   -1         550          201                190                 197                15        q22.2
124         ASPSCR1   346    NA     NA       NA         NA       NA         NA           1.7635   1          549          219                178                 198                17        q25.3
125         EPO       82     NA     NA       NA         NA       NA         NA           1.9843   1          735          169                235                 201                7         q22.1
126         HEY1      187    3      1        988        NA       NA         NA           NA       NA         NA           206                195                 203                8         q21.13
127         KALRN     11     NA     NA       NA         2.4      1          674          NA       NA         NA           197                205                 204                3         q21.1
128         RGS22     203    2.7    1        956        NA       NA         NA           NA       NA         NA           191                215                 205                8         q22.2
129         WDR7      353    NA     NA       NA         NA       NA         NA           1.9953   -1         653          200                217                 210                18        q21.31
130         COL11A1   5      NA     NA       NA         NA       NA         NA           1.8924   -1         591          233                206                 213                1         p21.1
131         GHDC      344    NA     NA       NA         NA       NA         NA           1.7523   1          523          221                218                 215                17        q21.2
132         ATP2C2    295    3.6    -1       943        NA       NA         NA           NA       NA         NA           216                226                 216                16        q24.1
133         CDH17     196    2.8    1        976        NA       NA         NA           NA       NA         NA           227                216                 217                8         q22.1
134         DGKG      17     NA     NA       NA         1.9      1          568          NA       NA         NA           192                258                 219                3         q27.3
135         GRK5      236    NA     NA       NA         2.4      -1         831          NA       NA         NA           210                237                 220                10        q26.11
136         GRM1      52     NA     NA       NA         NA       NA         NA           1.8988   -1         587          179                283                 223                6         q24.3
137         IMPA1     190    NA     NA       NA         1.9      1          647          NA       NA         NA           243                210                 224                8         q21.13
138         RPL7      176    2.3    1        813        NA       NA         NA           NA       NA         NA           261                211                 229                8         q21.11
139         COL21A1   38     NA     NA       NA         NA       NA         NA           1.8391   -1         596          235                246                 232                6         p12.1
140         COL12A1   40     NA     NA       NA         NA       NA         NA           1.8241   -1         597          241                240                 233                6         q14.1
141         MLYCD     289    2.4    -1       819        NA       NA         NA           NA       NA         NA           234                248                 234                16        q23.3
142         AR        366    2.3    1        690        2.6      1          806          NA       NA         NA           266                221                 235                23        q12
143         PLCB1     359    NA     NA       NA         NA       NA         NA           1.9352   -1         579          181                330                 240                20        p12.3
144         ACTL8     3      NA     NA       NA         1.9      -1         582          NA       NA         NA           264                229                 242                1         p36.13
145         TFDP1     257    NA     NA       NA         2.3      -1         729          NA       NA         NA           205                304                 248                13        q34
146         IQCE      55     NA     NA       NA         NA       NA         NA           1.8487   1          580          250                260                 255                7         p22.2
147         SMARCB1   360    NA     NA       NA         1.8      -1         523          NA       NA         NA           239                276                 256                22        q11.23
148         MTDH      199    NA     NA       NA         1.9      1          584          NA       NA         NA           225                301                 259                8         q22.1
149         NECAB2    290    NA     NA       NA         2        -1         688          NA       NA         NA           255                271                 262                16        q23.3
150         DEF8      334    NA     NA       NA         1.9      -1         678          NA       NA         NA           214                335                 266                16        q24.3
151         RNF40     262    NA     NA       NA         NA       NA         NA           2.0578   1          774          320                227                 270                16        p11.2
152         TICAM2    37     NA     NA       NA         NA       NA         NA           1.8257   -1         589          303                241                 271                5         q22.3
153         GLG1      271    2.1    -1       647        NA       NA         NA           NA       NA         NA           327                225                 273                16        q22.3
154         MECOM     16     NA     NA       NA         2        1          587          NA       NA         NA           279                268                 277                3         q26.2
155         TCEB1     178    1.8    1        590        NA       NA         NA           NA       NA         NA           275                277                 279                8         q21.11
156         CTNNA2    7      NA     NA       NA         NA       NA         NA           1.8228   -1         538          331                231                 280                2         p12
157         NIPAL2    200    1.9    1        654        NA       NA         NA           NA       NA         NA           289                265                 282                8         q22.2
158         CDCA2     146    2      -1       686        NA       NA         NA           NA       NA         NA           301                255                 283                8         p21.2

Final-RANK  gene      gene-start  gene-end   genesBtwn  contig  clump-index  dist-prev  dist-next  min-dist-to-RGL  Index0-Proxy1  NYU-Zadjust  MSKs1-Zadjust  MSKs2-Zadjust
119         ALCAM     106568403   106778433  49         0       NA           8761797    NA         8761797          1              NA           NA             0.23
120         KIF13B    28980715    29176529   NA         1       32           NA         -14009     -14009           1              0.33         NA             NA
121         MBTPS1    82644872    82708018   0          1       54           5371       -50994     5371             1              0.33         NA             NA
122         PPP2R5B   64448756    64458523   NA         0       NA           NA         -80139     -80139           1              NA           NA             0.02
123         VPS13C    59931884    60139939   NA         0       NA           NA         -24742144  -24742144        1              NA           NA             0.02
124         ASPSCR1   77528715    77568569   6          1       65           40474      -16362     -16362           1              NA           NA             0.01
125         EPO       100156359   100159257  146        0       NA           29534682   -31553     -31553           1              NA           NA             0.07
126         HEY1      80838801    80842653   3          1       36           870671     -97933     -97933           1              0.47         NA             NA
127         KALRN     125296275   125922726  76         0       NA           8678754    -210407    -210407          1              NA           0.21           NA
128         RGS22     101042452   101187520  7          0       NA           812460     -83469     -83469           1              0.33         NA             NA
129         WDR7      52469614    52848040   45         0       NA           12833135   -3157834   -3157834         1              NA           NA             0.07
130         COL11A1   103114611   103346640  NA         0       NA           NA         -4955408   -4955408         1              NA           NA             0.04
131         GHDC      37594632    37599722   482        0       NA           39903967   -7810      -7810            1              NA           NA             0.01
132         ATP2C2    82959634    83055293   0          1       55           13315      -38746     13315            1              0.80         NA             NA
133         CDH17     95208566    95289986   14         0       NA           2053354    -2729012   2053354          1              0.38         NA             NA
134         DGKG      187347686   187562717  23         0       NA           3269193    -17000632  3269193          1              NA           0.05           NA
135         GRK5      120957091   121205118  NA         NA      NA           NA         NA         NA               1              NA           0.21           NA
136         GRM1      146390611   146800427  83         0       NA           18812721   -291927    -291927          1              NA           NA             0.04
137         IMPA1     82732751    82761115   4          0       NA           2842997    -545893    -545893          1              NA           0.05           NA
138         RPL7      74365073    74375857   1          0       NA           119303     -352193    119303           1              0.17         NA             NA
139         COL21A1   56029347    56366851   NA         NA      NA           NA         NA         NA               1              NA           NA             0.03
140         COL12A1   75850762    75972343   18         0       NA           7686493    -4871884   -4871884         1              NA           NA             0.03
141         MLYCD     82490231    82507286   1          0       NA           52452      -102526    52452            1              0.21         NA             NA
142         AR        66680599    66860844   0          1       69           318596     -904991    318596           1              0.17         0.29           NA
143         PLCB1     8061296     8813547    NA         NA      NA           NA         NA         NA               1              NA           NA             0.05
144         ACTL8     17954395    18026145   662        1       1            79289745   -57439     -57439           1              NA           0.05           NA
145         TFDP1     113287057   113343500  NA         0       NA           NA         -12420243  -12420243        1              NA           0.17           NA
146         IQCE      2565158     2620893    13         0       NA           2861062    NA         2861062          1              NA           NA             0.03
147         SMARCB1   22459150    22506703   290        0       NA           23030490   NA         23030490         1              NA           0.02           NA
148         MTDH      98725583    98807711   7          0       NA           465852     -1032370   465852           1              NA           0.05           NA
149         NECAB2    82559738    82593878   1          0       NA           50994      -52452     50994            1              NA           0.07           NA
150         DEF8      88542684    88561968   0          1       63           4521       -12678     4521             1              NA           0.05           NA
151         RNF40     30681100    30695129   NA         0       NA           NA         -392513    -392513          1              NA           NA             0.09
152         TICAM2    114942247   114989610  NA         0       NA           NA         -26727467  -26727467        1              NA           NA             0.03
153         GLG1      73043357    73198518   3          0       NA           266457     -1403582   266457           1              0.10         NA             NA
154         MECOM     170283981   170347054  89         0       NA           17000632   -8012470   -8012470         1              NA           0.07           NA
155         TCEB1     75021184    75046959   2          0       NA           262534     -199555    -199555          1              0.02         NA             NA
156         CTNNA2    79732191    80729415   NA         0       NA           NA         -28619013  -28619013        1              NA           NA             0.03
157         NIPAL2    99273563    99375797   1          0       NA           160240     -465852    160240           1              0.05         NA             NA
158         CDCA2     25372428    25421353   0          1       29           336689     -591       -591             1              0.07         NA             NA
NA 
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159 

WWP2 

267 

1.8 

-1 

527 

NA 

NA 

NA 

NA 

NA 

NA 

251 

315 

284 

16 

q22.1 

160 

DDX19A 

268 

2.3 

-1 

756 

NA 

NA 

NA 

NA 

NA 

NA 

220 

363 

285 

16 

q22.1 

161 

STK3 

201 

1.8 

1 

614 

NA 

NA 

NA 

NA 

NA 

NA 

265 

309 

287 

8 

q22.2 

162 

DNAH2 

337 

1.8 

-1 

541 

NA 

NA 

NA 

NA 

NA 

NA 

247 

332 

288 

17 

pi  3.1 

163 

NFAT5 

266 

2.3 

-1 

760 

NA 

NA 

NA 

NA 

NA 

NA 

326 

254 

291 

16 

q22.1 

164 

CNGB1 

263 

1.8 

-1 

524 

NA 

NA 

NA 

NA 

NA 

NA 

297 

280 

292 

16 

q13 

165 

UBE2CBP 

41 

2.8 

-1 

891 

NA 

NA 

NA 

NA 

NA 

NA 

256 

326 

293 

6 

ql  4. 1 

166 

C8orf16 

99 

2.2 

-1 

725 

NA 

NA 

NA 

NA 

NA 

NA 

285 

293 

294 

8 

p23.1 

167 

KIAA0196 

220 

2.6 

1 

819 

NA 

NA 

NA 

NA 

NA 

NA 

253 

334 

296 

8 

q24.13 

168 

CLCNKB 

1 

NA 

NA 

NA 

NA 

NA 

NA 

2.0014 

1 

746 

276 

307 

297 

1 

p36.13 

169 

C16orf80 

264 

2.2 

-1 

677 

NA 

NA 

NA 

NA 

NA 

NA 

281 

302 

298 

16 

q21 

170 

ZFHX3 

270 

2.2 

-1 

656 

NA 

NA 

NA 

NA 

NA 

NA 

313 

273 

299 

16 

q22.3 

171 

PPM1L 

15 

NA 

NA 

NA 

2 

1 

628 

NA 

NA 

NA 

270 

329 

303 

3 

q26.1 

172 

NKIRAS2 

338 

NA 

NA 

NA 

NA 

NA 

NA 

1.9634 

1 

679 

298 

299 

304 

17 

q21.2

173 

RSPO2

215 

1.8 

1 

550 

NA 

NA 

NA 

NA 

NA 

NA 

306 

292 

305 

8 

q23.1 

174 

XPO7

119 

2.3 

-1 

735 

NA 

NA 

NA 

NA 

NA 

NA 

329 

272 

306 

8 

p21.3

175 

ME1 

42 

2.5 

-1 

728 

NA 

NA 

NA 

NA 

NA 

NA 

282 

321 

307 

6 

q14.2 

176 

NLGN4Y 

368 

NA 

NA 

NA 

NA 

NA 

NA 

2.4188 

-1 

734 

339 

275 

312 

24 

q11.221

177 

LZTS1 

118 

2 

-1 

645 

NA 

NA 

NA 

NA 

NA 

NA 

300 

316 

316 

8 

p21.3

178 

FBXL18 

56 

NA 

NA 

NA 

NA 

NA 

NA 

1.8646

1 

652 

323 

294 

317 

7 

p22.1 

179 

TBC1D10B 

261 

NA 

NA 

NA 

NA 

NA 

NA 

1.8243

1 

573 

278 

347 

321 

16 

p11.2

180 

WDR59 

272 

2.1 

-1 

653 

NA 

NA 

NA 

NA 

NA 

NA 

304 

320 

322 

16 

q23.1 

181 

BLK 

101 

2.1 

-1 

671 

NA 

NA 

NA 

NA 

NA 

NA 

315 

314 

325 

8 

p23.1 

182 

MEPCE 

71 

NA 

NA 

NA 

NA 

NA 

NA 

2.1134 

1 

782 

350 

285 

327 

7 

q22.1 

183 

DLGAP2 

91 

NA 

NA 

NA 

2.2 

-1 

682 

NA 

NA 

NA 

356 

286 

330 

8 

p23.3 

184 

ZFAT 

234 

2.5 

1 

796 

NA 

NA 

NA 

NA 

NA 

NA 

325 

317 

331 

8 

q24.22 

185 

FASN 

348 

NA 

NA 

NA 

NA 

NA 

NA 

3.0027 

1 

963 

296 

350 

332 

17 

q25.3 

186 

GIGYF1 

81 

NA 

NA 

NA 

NA 

NA 

NA 

2.7127 

1 

957 

335 

311 

335 

7 

q22.1 

187 

ANXA13

219 

2.1 

1 

682 

NA 

NA 

NA 

NA 

NA 

NA 

310 

345 

336 

8 

q24.13 

188 

CDYL2 

280 

2.5 

-1 

699 

NA 

NA 

NA 

NA 

NA 

NA 

316 

351 

339 

16 

q23.2 

189 

TOX 

161 

4.3 

1 

993 

NA 

NA 

NA 

NA 

NA 

NA 

338 

342 

349 

8 

q12.1

190 

NKX2-6 

143 

2.4 

-1 

870 

NA 

NA 

NA 

NA 

NA 

NA 

340 

366 

357 

8 

p21.2

191 

RALYL 

191 

2.8 

1 

985 

NA 

NA 

NA 

NA 

NA 

NA 

345 

362 

359 

8 

q21.2

192 

TBC1D22A

361 

NA 

NA 

NA 

4.6 

-1 

999 

NA 

NA 

NA 

367 

346 

363 

22 

q13.31

193 

TFE3 

363 

NA 

NA 

NA 

2.1 

1 

591 

NA 

NA 

NA 

362 

353 

364 

23 

p11.23

194 

KCNAB1 

14 

NA 

NA 

NA 

5.8 

1 

996 

NA 

NA 

NA 

363 

367 

367 

3 

q25.31 

195 

SULF1 

166 

5.2 

1 

1000 

3.4 

1 

994 

NA 

NA 

NA 

3 

4 

3 

8 

q13.2

196 

RAB5C 

342 

NA 

NA 

NA 

NA 

NA 

NA 

3.5399 

1 

998 

6 

3 

5 

17 

q21.2

197 

DHX58 

339 

NA 

NA 

NA 

NA 

NA 

NA 

8.9116 

1 

952 

14 

16 

16 

17 

q21.2

198 

ASAP1 

224 

NA 

NA 

NA 

3.6 

1 

974 

NA 

NA 

NA 

21 

23 

20 

8 

q24.21 
159

WWP2 

68353710 

68533145 

5 

0 

NA 

405177 

-57656 

-57656 

1 

0.02 

NA 

NA 

160 

DDX19A 

68938322 

68964780 

0 

1 

50 

6059 

-405177 

6059 

1 

0.17 

NA 

NA 

161 

STK3 

99536037 

99907074 

1 

0 

NA 

187596 

-160240 

-160240 

1 

0.02 

NA 

NA 

162 

DNAH2 

7562746 

7677783 

NA 

NA 

NA 

NA 

NA 

NA 

1 

0.02 

NA 

NA 

163 

NFAT5 

68156498 

68296054 

2 

0 

NA 

57656 

-7528258 

57656 

1 

0.17 

NA 

NA 

164 

CNGB1 

56475004 

56562513 

3 

0 

NA 

142487 

NA 

142487 

1 

0.02 

NA 

NA 

165 

UBE2CBP 

83658836 

83832269 

3 

0 

NA 

144558 

-7686493 

144558 

1 

0.38 

NA 

NA 

166 

C8orf16 

11021390 

11025155 

0 

1 

17 

154255 

-697587 

154255 

1 

0.14 

NA 

NA 

167 

KIAA0196 

126105691 

126173191 

3 

0 

NA 

2323848 

-1286863 

-1286863 

1 

0.29 

NA 

NA 

168 

CLCNKB 

16242834 

16256390 

29 

0 

NA 

1482527 

NA 

1482527 

1 

NA 

NA 

0.07 

169 

C16orf80

56705000 

56720797 

10 

0 

NA 

3524069 

-142487 

-142487 

1 

0.14 

NA 

NA 

170 

ZFHX3 

71374285 

71639775 

2 

0 

NA 

1403582 

-2343793 

1403582 

1 

0.14 

NA 

NA 

171 

PPM1L 

161956791 

162271511 

13 

0 

NA 

8012470 

-4217170 

-4217170

1 

NA 

0.07 

NA 

172 

NKIRAS2 

37422564 

37431180 

1 

0 

NA 

75799 

NA 

75799 

1 

NA 

NA 

0.06 

173 

RSPO2

108980721 

109165052 

9 

1 

42 

4139285 

-401262 

-401262 

1 

0.02 

NA 

NA 

174 

XPO7

21833126 

21920041 

3 

0 

NA 

82619 

-1627372 

82619 

1 

0.17 

NA 

NA 

175 

ME1 

83976827 

84197498 

41 

0 

NA 

7084576 

-144558 

-144558 

1 

0.25 

NA 

NA 

176 

NLGN4Y 

15144026 

15466924 

NA 

NA 

NA 

NA 

NA 

NA 

1 

NA 

NA 

0.21 

177 

LZTS1 

20147956 

20205754 

2 

0 

NA 

1627372

-850362 

-850362 

1 

0.07 

NA 

NA 

178 

FBXL18 

5481955 

5523646 

NA 

0 

NA 

NA 

-2861062 

-2861062 

1 

NA 

NA 

0.04 

179 

TBC1D10B 

30275925 

30288587 

14 

0 

NA 

392513 

NA 

392513 

1 

NA 

NA 

0.03 

180 

WDR59 

73464975 

73576518 

5 

0 

NA 

243911

-266457 

243911

1 

0.10 

NA 

NA 

181 

BLK 

11388930 

11459516 

1 

0 

NA 

139646 

-165868 

139646 

1 

0.10 

NA 

NA 

182 

MEPCE 

99865190 

99869676 

2 

0 

NA 

32404 

-29540 

-29540 

1 

NA 

NA 

0.11 

183 

DLGAP2 

1436939 

1644048 

1 

0 

NA 

115501 

NA 

115501 

1 

NA 

0.14 

NA 

184 

ZFAT 

135559215 

135794463 

8 

0 

NA 

5943220 

-1248464 

-1248464 

1 

0.25 

NA 

NA 

185 

FASN 

77629504 

77649395 

NA 

1 

66 

NA 

-262 

-262 

1 

NA 

NA 

0.47 

186 

GIGYF1 

100115066 

100124806 

1 

0 

NA 

31553 

-23059 

-23059 

1 

NA 

NA 

0.33 

187 

ANXA13

124762216 

124818828 

11 

0 

NA 

1286863

-706280 

-706280 

1 

0.10 

NA 

NA 

188 

CDYL2 

79195176 

79395680 

3 

0 

NA 

248923 

-1391644 

248923 

1 

0.25 

NA 

NA 

189 

TOX 

59880531 

60194321 

10 

0 

NA 

5476925 

NA 

5476925 

1 

1.23 

NA 

NA 

190 

NKX2-6 

23615909 

23620056 

6 

0 

NA 

1478148 

-129901 

-129901 

1 

0.21 

NA 

NA 

191 

RALYL 

85604112 

85963979 

12 

0 

NA 

1691298 

-2842997 

1691298 

1 

0.38 

NA 

NA 

192 

TBC1D22A 

45537193 

45948399 

NA 

0 

NA 

NA 

-23030490 

-23030490 

1 

NA 

1.43 

NA 

193 

TFE3 

48772613 

48787722 

NA 

0 

NA 

NA 

-35134932 

-35134932 

1 

NA 

0.10 

NA 

194 

KCNAB1 

157321095 

157739621 

22 

0 

NA 

4217170 

-12271116 

4217170 

1 

NA 

2.24 

NA 

195 

SULF1 

70541427 

70735701 

0 

1 

33 

11428 

-647617 

11428 

0 

1.83 

0.68 

NA 

196 

RAB5C 

37530524 

37560548 

0 

1 

64 

1891 

-1627 

-1627 

0 

NA 

NA 

0.76 

197 

DHX58 

37506979 

37518277 

0 

1 

64 

380 

-75799 

380 

0 

NA 

NA 

4.22 

198 

ASAP1 

131133535 

131483399 

0 

1 

45 

378337 

-2104073 

378337 

0 

NA 

0.80 

NA 
199

CA5A 

313 

2.6 

-1 

832 

3.8 

-1 

955 

NA 

NA 

NA 

23 

28 

26 

16 

q24.2 

200 

C6orf118

53 

NA 

NA 

NA 

NA 

NA 

NA 

2.7921 

-1 

976 

36 

36 

36 

6 

q27 

201 

NCOA2 

169 

3.2 

1 

997 

2.4 

1 

806 

NA 

NA 

NA 

35 

40 

37 

8 

q13.3

202 

PKD1L2 

283 

4.9 

-1 

999 

2 

-1 

715 

NA 

NA 

NA 

41 

45 

42 

16 

q23.2 

203 

BANP 

314 

2.6 

-1 

901 

3.3 

-1 

957 

NA 

NA 

NA 

42 

49 

46 

16 

q24.2 

204 

KIAA1967 

133 

2.8 

-1 

925 

3.1 

-1 

989 

NA 

NA 

NA 

50 

57 

52 

8 

p21.3

205 

COPG2 

89 

NA 

NA 

NA 

NA 

NA 

NA 

3.1195 

1 

936 

56 

52 

53 

7 

q32.2 

206 

ZNF706 

205 

NA 

NA 

NA 

2.8 

1 

889 

NA 

NA 

NA 

53 

56 

54 

8 

q22.3 

207 

GAN 

285 

2.7 

-1 

869 

2.4 

-1 

902 

NA 

NA 

NA 

49 

61 

55 

16 

q23.2 

208 

PLCG2 

286 

2.9 

-1 

833 

2.7 

-1 

913 

NA 

NA 

NA 

61 

50 

56 

16 

q23.2 

209 

C19orf57 

356 

NA 

NA 

NA 

NA 

NA 

NA 

2.7945 

1 

992 

58 

58 

59 

19 

p13.12

210 

PDGFRL 

111 

4.8 

-1 

998 

NA 

NA 

NA 

NA 

NA 

NA 

62 

55 

60 

8 

p22 

211 

ESD 

249 

NA 

NA 

NA 

NA 

NA 

NA 

2.5793 

-1 

973 

65 

59 

62 

13 

q14.2 

212 

CPA5 

85 

NA 

NA 

NA 

NA 

NA 

NA 

2.7623 

1 

924 

70 

63 

67 

7 

q32.2 

213 

BIN3 

134 

1.7 

-1 

507 

2.8 

-1 

992 

NA 

NA 

NA 

71 

73 

70 

8 

p21.3

214 

ZFHX4 

184 

4.3 

1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

74 

76 

73 

8 

q21.11

215 

CPA6 

163 

3.8 

1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

77 

81 

75 

8 

q13.2

216 

EYA1 

172 

3.4 

1 

997 

NA 

NA 

NA 

NA 

NA 

NA 

73 

89 

77 

8 

q13.3

217 

CHRNA2 

153 

3.5 

-1 

999 

NA 

NA 

NA 

NA 

NA 

NA 

76 

87 

79 

8 

p21.2

218 

TNKS 

97 

4 

-1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

87 

84 

84 

8 

p23.1 

219 

HNF4G 

183 

4.1 

1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

103 

72 

87 

8 

q21.11

220 

LRCH1 

248 

NA 

NA 

NA 

NA 

NA 

NA 

2.3847 

-1 

801 

79 

94 

88 

13 

q14.13

221 

ADRA1A 

149 

3.9 

-1 

991 

NA 

NA 

NA 

NA 

NA 

NA 

96 

79 

89 

8 

p21.2

222 

EPHX2 

154 

3.3 

-1 

997 

NA 

NA 

NA 

NA 

NA 

NA 

89 

88 

90 

8 

p21.1

223 

SORBS3 

130 

NA 

NA 

NA 

3 

-1 

957 

NA 

NA 

NA 

83 

95 

91 

8 

p21.3

224 

GRIA2 

27 

NA 

NA 

NA 

NA 

NA 

NA 

2.2933 

-1 

843 

88 

96 

93 

4 

q32.1 

225 

PDLIM2 

131 

NA 

NA 

NA 

2.9 

-1 

993 

NA 

NA 

NA 

94 

91 

94 

8 

p21.3

226 

MTMR7 

109 

3.7 

-1 

971 

NA 

NA 

NA 

NA 

NA 

NA 

86 

106 

96 

8 

p22 

227 

FBXO24

76 

NA 

NA 

NA 

NA 

NA 

NA 

2.4831 

1 

817 

118 

85 

99 

7 

q22.1 

228 

CRISPLD1 

182 

4.9 

1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

95 

124 

107 

8 

q21.11

229 

DPYS 

211 

3.2 

1 

976 

NA 

NA 

NA 

NA 

NA 

NA 

92 

129 

109 

8 

q22.3 

230 

DTNA 

351 

NA 

NA 

NA 

NA 

NA 

NA 

2.2378 

-1 

734 

102 

125 

112 

18 

q12.1

231 

KLHDC4 

311 

NA 

NA 

NA 

2.5 

-1 

987 

NA 

NA 

NA 

116 

111 

113 

16 

q24.2 

232 

CYBA 

319 

NA 

NA 

NA 

2.9 

-1 

941 

NA 

NA 

NA 

117 

121 

118 

16 

q24.3 

233 

JPH3

310 

2.4 

-1 

766 

2.4 

-1 

908 

NA 

NA 

NA 

101 

142 

120 

16 

q24.2 

234 

TMEM120A 

64 

NA 

NA 

NA 

NA 

NA 

NA 

1.7093

1 

511 

128 

115 

121 

7 

q11.23

235 

MTUS1 

112 

3.6 

-1 

976 

NA 

NA 

NA 

NA 

NA 

NA 

143 

116 

127 

8 

p22 

236 

C8orf34 

165 

6 

1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

126 

132 

128 

8 

q13.2 

237 

GRHL2 

206 

NA 

NA 

NA 

2.4 

1 

790 

NA 

NA 

NA 

125 

140 

132 

8 

q22.3 

238 

CPA2 

83 

NA 

NA 

NA 

NA 

NA 

NA 

2.1399 

1 

717 

153 

117 

133 

7 

q32.2 
199

CA5A 

86479126 

86527613

0 

1 

58 

14926 

-18511 

14926 

0 

0.29 

0.91 

NA 

200 

C6orf118

165613148 

165643101 

0 

1 

8 

17665 

-18812721 

17665 

0 

NA 

NA 

0.37 

201 

NCOA2 

71178380 

71478574 

1 

1 

33 

233471 

-32264 

-32264 

0 

0.57 

0.21 

NA 

202 

PKD1L2 

79691985 

79811477 

0 

1 

53 

18320 

-4504 

-4504 

0 

1.63 

0.07 

NA 

203 

BANP 

86542539 

86668425 

0 

1 

58 

378801 

-14926 

-14926 

0 

0.29 

0.63 

NA 

204 

KIAA1967 

22518202 

22533920 

0 

1 

26 

-14 

-597 

-14 

0 

0.38 

0.52 

NA 

205 

COPG2 

129933404 

129935887 

106 

1 

13 

13747479

-41 

-41 

0 

NA 

NA 

0.53 

206 

ZNF706 

102278444 

102287136 

0 

1 

40 

287026 

-243699 

-243699 

0 

NA 

0.38 

NA 

207 

GAN 

79906076 

79971441 

0 

1 

53 

398967 

-23828 

-23828 

0 

0.33 

0.21 

NA 

208 

PLCG2 

80370408 

80549399 

0 

1 

53 

76965 

-398967 

76965 

0 

0.42 

0.33 

NA 

209 

C19orf57

13854168 

13877909 

0 

1 

68 

105 

-50124 

105 

0 

NA 

NA 

0.37 

210 

PDGFRL 

17478443 

17545655 

0 

1 

21 

-71 

-6086 

-71 

0 

1.56 

NA 

NA 

211 

ESD 

46243393 

46269368 

0 

1 

48 

36146 

-20607 

-20607 

0 

NA 

NA 

0.28 

212 

CPA5 

129771892 

129795807 

0 

1 

13 

11661 

-20643 

11661 

0 

NA 

NA 

0.36 

213 

BIN3 

22533906 

22582553 

0 

1 

26 

18566 

14 

14 

0 

0.00 

0.38 

NA 

214 

ZFHX4 

77756078 

77942076 

1 

1 

35 

1648815 

-1114478 

-1114478 

0 

1.23 

NA 

NA 

215 

CPA6 

68496963 

68821134 

0 

1 

33 

205773 

-2623061 

205773 

0 

0.91 

NA 

NA 

216 

EYA1 

72272222 

72437021 

0 

1 

34 

479311

-463009 

-463009 

0 

0.68 

NA 

NA 

217 

CHRNA2 

27373196 

27392730 

0 

1 

30 

11832 

-376 

-376 

0 

0.74 

NA 

NA 

218 

TNKS 

9450855 

9677266 

0 

1 

16 

271923 

-522716 

271923 

0 

1.04 

NA 

NA 

219 

HNF4G 

76482732 

76641600 

0 

1 

35 

1114478 

-373386 

-373386 

0 

1.10 

NA 

NA 

220 

LRCH1 

46025304 

46222786 

0 

1 

48 

20607 

-2766260 

20607 

0 

NA 

NA 

0.20 

221 

ADRA1A 

26661584 

26778839 

0 

1 

30 

370899 

-89977 

-89977 

0 

0.98 

NA 

NA 

222 

EPHX2 

27404562 

27458403 

2 

1 

30 

188353 

-11832 

-11832 

0 

0.63 

NA 

NA 

223 

SORBS3 

22465196 

22488952 

0 

1 

26 

3247 

-10616 

3247 

0 

NA 

0.47 

NA 

224 

GRIA2 

158361186 

158506677 

9 

1 

3 

4017824 

-48887 

-48887 

0 

NA 

NA 

0.17 

225 

PDLIM2 

22492199 

22511483 

0 

1 

26 

1584 

-3247 

1584 

0 

NA 

0.42 

NA 

226 

MTMR7 

17199923 

17315207 

0 

1 

21 

83768 

-1533557 

83768 

0 

0.85 

NA 

NA 

227 

FBXO24

100021892 

100036674 

0 

1 

12 

1144 

-180 

-180 

0 

NA 

NA 

0.24 

228 

CRISPLD1 

76059531 

76109346 

0 

1 

35 

373386 

-129712 

-129712 

0 

1.63 

NA 

NA 

229 

DPYS 

105460829 

105548453

0 

1 

41 

22190 

-127566 

22190 

0 

0.57 

NA 

NA 

230 

DTNA 

30327279 

30725806 

62 

1 

67 

17395350 

-269766 

-269766 

0 

NA 

NA 

0.15 

231 

KLHDC4 

86298920 

86357056 

0 

1 

58 

64075 

-9657 

-9657 

0 

NA 

0.25 

NA 

232 

CYBA 

87237199 

87244958 

0 

1 

58 

891 

-2814 

891 

0 

NA 

0.42 

NA 

233 

JPH3 

86194000 

86289263 

0 

1 

58 

9657 

-1036491 

9657 

0 

0.21 

0.21 

NA 

234 

TMEM120A 

75454238 

75461913 

0 

1 

10 

1679 

-248023 

1679 

0 

NA 

NA 

0.00 

235 

MTUS1 

17545584 

17702666 

1 

1 

21 

121980 

71 

71 

0 

0.80 

NA 

NA 

236 

C8orf34 

69405511

69893810 

0 

1 

33 

647617 

-99060 

-99060 

0 

2.37 

NA 

NA 

237 

GRHL2 

102574162 

102750995 

0 

1 

40 

16952 

-287026 

16952 

0 

NA 

0.21 

NA 

238 

CPA2 

129693939 

129716870 

0 

1 

13 

3360 

-29534682 

3360 

0 

NA 

NA 

0.11 
239

NAT2 

115 

3.3 

-1 

993 

NA 

NA 

NA 

NA 

NA 

NA 

140 

134 

134 

8 

p22 

240 

DPYSL2 

148 

3.3 

-1 

967 

NA 

NA 

NA 

NA 

NA 

NA 

155 

122 

135.5 

8 

p21.2

241 

ZDHHC7 

300 

NA 

NA 

NA 

2.5 

-1 

839 

NA 

NA 

NA 

159 

123 

138 

16 

q24.1 

242 

ELP3 

158 

3.4 

-1 

939 

NA 

NA 

NA 

NA 

NA 

NA 

166 

118 

139 

8 

p21.1

243 

RHOBTB2 

136 

NA 

NA 

NA 

1.7 

-1 

501 

NA 

NA 

NA 

133 

150 

142 

8 

p21.3

244 

NEIL2 

103 

2.7 

-1 

921 

NA 

NA 

NA 

NA 

NA 

NA 

150 

135 

143 

8 

p23.1 

245 

HR 

122 

NA 

NA 

NA 

2.7 

-1 

896 

NA 

NA 

NA 

186 

112 

145 

8 

p21.3

246 

EFR3A 

226 

3.1 

1 

985 

NA 

NA 

NA 

NA 

NA 

NA 

144 

146 

146 

8 

q24.22 

247 

STMN4 

150 

3.3 

-1 

994 

NA 

NA 

NA 

NA 

NA 

NA 

162 

131 

147 

8 

p21.2

248 

PRDM14 

168 

4.7 

1 

996 

NA 

NA 

NA 

NA 

NA 

NA 

135 

171 

152 

8 

q13.3 

249 

MARVELD2 

35 

NA 

NA 

NA 

3 

-1 

988 

NA 

NA 

NA 

142 

164 

154 

5 

q13.2 

250 

SLC39A14 

128 

1.8 

-1 

560 

2.2 

-1 

791 

NA 

NA 

NA 

152 

160 

155 

8 

p21.3

251 

ACTL6B 

80 

NA 

NA 

NA 

NA 

NA 

NA 

1.7362

1 

538 

168 

158 

159 

7 

q22.1 

252 

TUSC3 

108 

3.1 

-1 

945 

NA 

NA 

NA 

NA 

NA 

NA 

157 

170 

160 

8 

p22 

253 

COX4NB 

305 

NA 

NA 

NA 

2.5 

-1 

938 

NA 

NA 

NA 

148 

181 

161 

16 

q24.1 

254 

XKR9 

171 

2.7 

1 

929 

NA 

NA 

NA 

NA 

NA 

NA 

165 

163 

162 

8 

q13.3 

255 

C16orf46 

281 

NA 

NA 

NA 

2.2 

-1 

768 

NA 

NA 

NA 

151 

183 

164 

16 

q23.2 

256 

TAF9 

33 

NA 

NA 

NA 

2.6 

-1 

963 

NA 

NA 

NA 

175 

162 

166 

5 

q13.2 

257 

KCNQ3 

228 

6 

1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

167 

180 

169 

8 

q24.22 

258 

UTRN 

50 

NA 

NA 

NA 

NA 

NA 

NA 

2.3296 

-1 

766 

174 

174 

170 

6 

q24.2 

259 

RAD17

34 

NA 

NA 

NA 

2.6 

-1 

969 

NA 

NA 

NA 

172 

182 

172 

5 

q13.2 

260 

ZFPM1 

315 

NA 

NA 

NA 

2.5 

-1 

924 

NA 

NA 

NA 

146 

219 

175 

16 

q24.2 

261 

PTDSS1 

197 

2.5 

1 

874 

NA 

NA 

NA 

NA 

NA 

NA 

184 

177 

179 

8 

q22.1 

262 

IRF8 

307 

NA 

NA 

NA 

2.5 

-1 

976 

NA 

NA 

NA 

199 

169 

181 

16 

q24.1 

263 

YWHAZ 

204 

NA 

NA 

NA 

2.2 

1 

722 

NA 

NA 

NA 

204 

166 

182 

8 

q22.3 

264 

MRPS36 

30 

NA 

NA 

NA 

2.6 

-1 

962 

NA 

NA 

NA 

195 

175 

183 

5 

q13.2 

265 

LACTB2 

170 

2.6 

1 

932 

NA 

NA 

NA 

NA 

NA 

NA 

160 

223 

187 

8 

q13.3

266 

SNAI3 

321 

NA 

NA 

NA 

2.4 

-1 

914 

NA 

NA 

NA 

231 

156 

190 

16 

q24.3 

267 

TMEM71 

229 

2.9 

1 

993 

NA 

NA 

NA 

NA 

NA 

NA 

180 

207 

194 

8 

q24.22 

268 

PREX2 

164 

7.5 

1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

190 

199 

195 

8 

q13.2 

269 

CPA1 

86 

NA 

NA 

NA 

NA 

NA 

NA 

2.0683 

1 

716 

228 

173 

199 

7 

q32.2 

270 

PHF20L1 

230 

2.8 

1 

901 

NA 

NA 

NA 

NA 

NA 

NA 

198 

200 

200 

8 

q24.22 

271 

KIAA0513

301 

NA 

NA 

NA 

2.1 

-1 

816 

NA 

NA 

NA 

212 

188 

202 

16 

q24.1 

272 

PI15

181 

3 

1 

991 

NA 

NA 

NA 

NA 

NA 

NA 

238 

179 

206 

8 

q21.11

273 

PCM1 

113 

1.7 

-1 

529 

NA 

NA 

NA 

NA 

NA 

NA 

183 

234 

207 

8 

p22 

274 

SH2D4A 

117 

2.9 

-1 

908 

NA 

NA 

NA 

NA 

NA 

NA 

249 

172 

208 

8 

p21.3

275 

C16orf74 

304 

NA 

NA 

NA 

2.3 

-1 

939 

NA 

NA 

NA 

202 

214 

209 

16 

q24.1 

276 

TP63 

18 

NA 

NA 

NA 

3 

1 

822 

NA 

NA 

NA 

203 

228 

211 

3 

q28 

277 

DACH1 

254 

NA 

NA 

NA 

NA 

NA 

NA 

1.8675

-1 

570 

252 

185 

212 

13 

q21.33

278 

TNFRSF10A 

138 

NA 

NA 

NA 

2.2 

-1 

774 

NA 

NA 

NA 

245 

196 

214 

8 

p21.3
239

NAT2 

18293035 

18303003 

0 

1 

23 

126090 

-306248 

126090 

0 

0.63 

NA 

NA 

240 

DPYSL2 

26491327 

26571607 

0 

1 

30 

89977 

-533035 

89977 

0 

0.63 

NA 

NA 

241 

ZDHHC7 

83565573 

83602642 

0 

1 

56 

16269 

-64959 

16269 

0 

NA 

0.25 

NA 

242 

ELP3 

27999759 

28104584 

6 

1 

31 

699246 

-2452 

-2452 

0 

0.68 

NA 

NA 

243 

RHOBTB2 

22913059 

22933655 

2 

1 

26 

115396

-306299 

115396

0 

NA 

0.00 

NA 

244 

NEIL2 

11664627

11682263

1 

1 

18 

55179 

-9709 

-9709 

0 

0.33 

NA 

NA 

245 

HR 

22027877 

22045326 

0 

1 

24 

6152 

-4474 

-4474 

0 

NA 

0.33 

NA 

246 

EFR3A 

132985517 

133095071 

0 

1 

45 

10596 

-861663 

10596 

0 

0.52 

NA 

NA 

247 

STMN4 

27149738 

27171843 

0 

1 

30 

26478 

-370899 

26478 

0 

0.63 

NA 

NA 

248 

PRDM14 

71126574 

71146116 

0 

1 

33 

32264 

-216812 

32264 

0 

1.49 

NA 

NA 

249 

MARVELD2 

68746699 

68773646 

82 

1 

4 

19278276

-315 

-315 

0 

NA 

0.47 

NA 

250 

SLC39A14 

22280737 

22347462 

0 

1 

26 

7079 

-116113 

7079 

0 

0.02 

0.14 

NA 

251 

ACTL6B 

100078678 

100092007 

1 

1 

12 

23059 

-1569 

-1569 

0 

NA 

NA 

0.01 

252 

TUSC3 

15442101 

15666366 

6 

1 

20 

1533557

-301882 

-301882 

0 

0.52 

NA 

NA 

253 

COX4NB 

84369737 

84390601 

0 

1 

57 

96 

-27547 

96 

0 

NA 

0.25 

NA 

254 

XKR9 

71755848

71809213 

0 

1 

34 

463009 

-11902 

-11902 

0 

0.33 

NA 

NA 

255 

C16orf46

79644603 

79668373 

0 

1 

53 

5057 

-248923 

5057 

0 

NA 

0.14 

NA 

256 

TAF9 

68696327 

68701596 

0 

1 

4 

-716 

-31935 

-716 

0 

NA 

0.29 

NA 

257 

KCNQ3 

133210438 

133561961 

1 

1 

45 

217672 

-43354 

-43354 

0 

2.37 

NA 

NA 

258 

UTRN 

144654566 

145215859 

0 

1 

7 

772282 

-20654629 

772282 

0 

NA 

NA 

0.18 

259 

RAD17

68700880 

68746384 

0 

1 

4 

315 

716 

315 

0 

NA 

0.29 

NA 

260 

ZFPM1 

87047226 

87128890 

0 

1 

58 

18723 

-378801 

18723 

0 

NA 

0.25 

NA 

261 

PTDSS1 

97343340 

97415950 

0 

1 

39 

159108 

-2053354 

159108 

0 

0.25 

NA 

NA 

262 

IRF8 

84490275 

84513710 

0 

1 

57 

587924 

-92166 

-92166 

0 

NA 

0.25 

NA 

263 

YWHAZ 

101999980 

102034745

0 

1 

40 

243699 

-812460 

243699 

0 

NA 

0.14 

NA 

264 

MRPS36 

68549329 

68577710

0 

1 

4 

-11239 

-7390 

-7390 

0 

NA 

0.29 

NA 

265 

LACTB2 

71712045 

71743946 

0 

1 

34 

11902 

-233471 

11902 

0 

0.29 

NA 

NA 

266 

SNAI3 

87271591 

87280383 

0 

1 

58 

10028 

-14572 

10028 

0 

NA 

0.21 

NA 

267 

TMEM71 

133779633 

133842010 

0 

1 

46 

14776 

-217672 

14776 

0 

0.42 

NA 

NA 

268 

PREX2 

69026907 

69306451 

0 

1 

33 

99060 

-205773 

99060 

0 

3.36 

NA 

NA 

269 

CPA1 

129807468 

129815165 

0 

1 

13 

8446 

-11661 

8446 

0 

NA 

NA 

0.09 

270 

PHF20L1 

133856786 

133930234 

0 

1 

46 

18153 

-14776 

-14776 

0 

0.38 

NA 

NA 

271 

KIAA0513 

83618911 

83685327 

2 

1 

56 

517197 

-16269 

-16269 

0 

NA 

0.10 

NA 

272 

PI15

75899327 

75929819

0 

1 

35 

129712 

-457439 

129712 

0 

0.47 

NA 

NA 

273 

PCM1 

17824646 

17935562 

0 

1 

22 

22652 

-121980 

22652 

0 

0.00 

NA 

NA 

274 

SH2D4A 

19215483 

19297594 

5 

1 

23 

850362 

-300007 

-300007 

0 

0.42 

NA 

NA 

275 

C16orf74 

84298624 

84342190 

0 

1 

57 

27547 

-18535 

-18535 

0 

NA 

0.17 

NA 

276 

TP63 

190831910 

191107935 

0 

1 

2 

49278 

-3269193 

49278 

0 

NA 

0.47 

NA 

277 

DACH1 

70910099 

71339331 

28 

1 

49 

19509588

-1329507 

-1329507 

0 

NA 

NA 

0.04 

278 

TNFRSF10A 

23104916 

23138584 

0 

1 

27 

18511 

-27431 

18511 

0 

NA 

0.14 

NA 
279

MDH2 

66 

NA 

NA 

NA 

NA 

NA 

NA 

1.9653 

1 

728 

236 

208 

218 

7 

q11.23

280 

PAG1 

189 

NA 

NA 

NA 

2 

1 

776 

NA 

NA 

NA 

173 

290 

221 

8 

q21.13

281 

SLC25A37 

142 

2.6 

-1 

845 

NA 

NA 

NA 

NA 

NA 

NA 

226 

222 

222 

8 

p21.2

282 

BCAR1 

273 

2.5 

-1 

846 

NA 

NA 

NA 

NA 

NA 

NA 

240 

213 

225 

16 

q23.1 

283 

COX4I1 

306 

NA 

NA 

NA 

2.6 

-1 

911 

NA 

NA 

NA 

178 

289 

226 

16 

q24.1 

284 

EIF4H 

59 

NA 

NA 

NA 

NA 

NA 

NA 

2.0065 

1 

775 

224 

236 

227 

7 

q11.23

285 

ZC3H18 

317 

NA 

NA 

NA 

2.1 

-1 

878 

NA 

NA 

NA 

217 

244 

228 

16 

q24.2 

286 

STMN2 

186 

2.8 

1 

962 

NA 

NA 

NA 

NA 

NA 

NA 

284 

198 

230 

8 

q21.13

287 

AFG3L1 

335 

NA 

NA 

NA 

2.3 

-1 

947 

NA 

NA 

NA 

254 

224 

231 

16 

q24.3 

288 

HSD17B2 

287 

2.6 

-1 

791 

NA 

NA 

NA 

NA 

NA 

NA 

229 

259 

236 

16 

q23.3 

289 

MVD 

320 

NA 

NA 

NA 

2.3 

-1 

901 

NA 

NA 

NA 

223 

266 

237 

16 

q24.3 

290 

DLC1 

106 

6.5 

-1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

207 

288 

238 

8 

p22 

291 

EPHA7 

44 

NA 

NA 

NA 

NA 

NA 

NA 

1.7755

-1 

529 

237 

252 

239 

6 

q16.1

292 

TRIM35 

151 

2.6 

-1 

926 

NA 

NA 

NA 

NA 

NA 

NA 

209 

287 

241 

8 

p21.2

293 

LRRC50 

293 

2.4 

-1 

830 

NA 

NA 

NA 

NA 

NA 

NA 

232 

262 

243 

16 

q24.1 

294 

CNGB3 

192 

1.8 

1 

534 

NA 

NA 

NA 

NA 

NA 

NA 

319 

191 

244 

8 

q21.3

295 

ASCC3 

47 

NA 

NA 

NA 

NA 

NA 

NA 

1.7954

-1 

535 

246 

249 

245 

6 

q16.3

296 

RFC2 

61 

NA 

NA 

NA 

NA 

NA 

NA 

1.8399

1 

625 

208 

295 

246 

7 

q11.23

297 

CLEC3A 

278 

2.3 

-1 

781 

NA 

NA 

NA 

NA 

NA 

NA 

267 

232 

247 

16 

q23.1 

298 

IL17C 

318 

NA 

NA 

NA 

1.8 

-1 

639 

NA 

NA 

NA 

244 

256 

249 

16 

q24.3 

299 

BMP1 

125 

NA 

NA 

NA 

2.2 

-1 

819 

NA 

NA 

NA 

259 

242 

250 

8 

p21.3

300 

CPA4 

84 

NA 

NA 

NA 

NA 

NA 

NA 

1.9432 

1 

632 

242 

261 

251 

7 

q32.2 

301 

OC90 

227 

1.9 

1 

640 

NA 

NA 

NA 

NA 

NA 

NA 

262 

243 

252 

8 

q24.22 

302 

HEPH 

364 

1.8 

1 

537 

NA 

NA 

NA 

NA 

NA 

NA 

292 

220 

253 

23 

q12

303 

LRP12 

212 

NA 

NA 

NA 

2 

1 

635 

NA 

NA 

NA 

277 

233 

254 

8 

q22.3 

304 

AGFG2 

74 

NA 

NA 

NA 

NA 

NA 

NA 

2.2839 

1 

749 

317 

212 

257 

7 

q22.1 

305 

TRPA1 

174 

2.3 

1 

803 

NA 

NA 

NA 

NA 

NA 

NA 

257 

263 

258 

8 

q13.3

306 

GINS2 

303 

NA 

NA 

NA 

2.1 

-1 

861 

NA 

NA 

NA 

268 

253 

260 

16 

q24.1 

307 

CENPH 

29 

NA 

NA 

NA 

1.9 

-1 

693 

NA 

NA 

NA 

286 

238 

261 

5 

q13.2 

308 

KLHL36 

297 

NA 

NA 

NA 

1.8 

-1 

606 

NA 

NA 

NA 

222 

312 

263 

16 

q24.1 

309 

ARHGEF10L 

2 

NA 

NA 

NA 

2.1 

-1 

730 

NA 

NA 

NA 

258 

269 

264 

1 

p36.13 

310 

TRAPPC2L 

326 

NA 

NA 

NA 

1.9 

-1 

670 

NA 

NA 

NA 

302 

230 

265 

16 

q24.3 

311 

TCF25 

332 

NA 

NA 

NA 

2.1 

-1 

821 

NA 

NA 

NA 

272 

264 

267 

16 

q24.3 

312 

TNFRSF10D 

137 

1.9 

-1 

603 

NA 

NA 

NA 

NA 

NA 

NA 

288 

250 

268 

8 

p21.3

313 

MYOM2 

93 

2.1 

-1 

705 

NA 

NA 

NA 

NA 

NA 

NA 

295 

245 

269 

8 

p23.3 

314 

GCSH 

282 

NA 

NA 

NA 

1.9 

-1 

673 

NA 

NA 

NA 

248 

296 

272 

16 

q23.2 

315 

KIAA1609 

296 

NA 

NA 

NA 

1.9 

-1 

641 

NA 

NA 

NA 

260 

284 

274 

16 

q24.1 

316 

FANCA 

330 

NA 

NA 

NA 

1.9 

-1 

612 

NA 

NA 

NA 

299 

247 

275 

16 

q24.3 

317 

ERI1 

96 

1.9 

-1 

607 

NA 

NA 

NA 

NA 

NA 

NA 

312 

239 

276 

8 

p23.1 

318 

HSDL1 

292 

NA 

NA 

NA 

2 

-1 

685 

NA 

NA 

NA 

273 

278 

278 

16 

q24.1 
279

MDH2 

75515329 

75533864 

2 

1 

10 

260189 

-72 

-72 

0 

NA 

NA 

0.06 

280 

PAG1 

82042605 

82186858 

8 

1 

37 

545893 

-93034 

-93034 

0 

NA 

0.07 

NA 

281 

SLC25A37 

23442308 

23486008 

1 

1 

28 

129901 

-71227 

-71227 

0 

0.29 

NA 

NA 

282 

BCAR1 

73820429 

73859452 

0 

1 

51 

25657 

-243911

25657 

0 

0.25 

NA 

NA 

283 

COX4I1 

84390697 

84398109 

0 

1 

57 

92166 

-96 

-96 

0 

NA 

0.29 

NA 

284 

EIF4H 

73226625 

73249358 

0 

1 

9 

12304 

-404089 

12304 

0 

NA 

NA 

0.07 

285 

ZC3H18 

87164343 

87225756 

0 

1 

58 

6746 

-294 

-294 

0 

NA 

0.10 

NA 

286 

STMN2 

80685916 

80740868 

0 

1 

36 

97933 

-1007876 

97933 

0 

0.38 

NA 

NA 

287 

AFG3L1 

88566489 

88594696 

1 

1 

63 

21813 

-4521 

-4521 

0 

NA 

0.17 

NA 

288 

HSD17B2 

80626364 

80689638 

1 

1 

53 

750123 

-76965 

-76965 

0 

0.29 

NA 

NA 

289 

MVD 

87245849 

87257019 

0 

1 

58 

14572 

-891 

-891 

0 

NA 

0.17 

NA 

290 

DLC1 

12985243 

13416766 

1 

1 

19 

574978 

-53590 

-53590 

0 

2.71 

NA 

NA 

291 

EPHA7 

94007864 

94185993 

9 

1 

5 

5242062 

-2654236 

-2654236 

0 

NA 

NA 

0.01 

292 

TRIM35 

27198321 

27224751 

0 

1 

30 

165 

-26478 

165 

0 

0.29 

NA 

NA 

293 

LRRC50 

82736366 

82769024 

3 

1 

54 

116798

-101 

-101 

0 

0.21 

NA 

NA 

294 

CNGB3 

87655277 

87825017 

0 

1 

38 

122823 

-1691298 

122823 

0 

0.02 

NA 

NA 

295 

ASCC3 

101062791 

101435961 

79 

1 

6 

16667349

-43297 

-43297 

0 

NA 

NA 

0.02 

296 

RFC2 

73283770 

73306674 

0 

1 

9 

35065 

-1671 

-1671 

0 

NA 

NA 

0.03 

297 

CLEC3A 

76613944 

76623495 

0 

1 

52 

67557 

-280292 

67557 

0 

0.17 

NA 

NA 

298 

IL17C 

87232502 

87234385 

0 

1 

58 

2814 

-6746 

2814 

0 

NA 

0.02 

NA 

299 

BMP1 

22078645 

22125782 

0 

1 

25 

7380 

-8355 

7380 

0 

NA 

0.14 

NA 

300 

CPA4 

129720230 

129751249 

0 

1 

13 

20643 

-3360 

-3360 

0 

NA 

NA 

0.06 

301 

OC90 

133105667 

133167084 

0 

1 

45 

43354 

-10596 

-10596 

0 

0.05 

NA 

NA 

302 

HEPH 

65299388 

65403956 

0 

1 

69 

328248 

NA 

328248 

0 

0.02 

NA 

NA 

303 

LRP12 

105570643 

105670344

0 

1 

41 

729979 

-22190 

-22190 

0 

NA 

0.07 

NA 

304 

AGFG2 

99974770 

100003778

0 

1 

12 

5792 

-44412 

5792 

0 

NA 

NA 

0.16 

305 

TRPA1 

73096040 

73150373 

0 

1 

34 

492151 

-176755 

-176755 

0 

0.17 

NA 

NA 

306 

GINS2 

84268782 

84280089 

0 

1 

57 

18535 

-1471 

-1471 

0 

NA 

0.10 

NA 

307 

CENPH 

68521131 

68541939 

0 

1 

4 

7390 

NA 

7390 

0 

NA 

0.05 

NA 

308 

KLHL36 

83239632 

83253416

0 

1 

56 

37634 

-143838 

37634 

0 

NA 

0.02 

NA 

309 

ARHGEF10L 

17738917 

17896956 

0 

1 

1 

57439 

-1482527 

57439 

0 

NA 

0.10 

NA 

310 

TRAPPC2L 

87451007 

87455020 

0 

1 

60 

13748 

-122 

-122 

0 

NA 

0.05 

NA 

311 

TCF25 

88467520 

88505287 

0 

1 

62 

7881 

-2292 

-2292 

0 

NA 

0.10 

NA 

312 

TNFRSF10D 

23049051 

23077485 

0 

1 

27 

27431 

-115396 

27431 

0 

0.05 

NA 

NA 

313 

MYOM2 

1980565

2080779 

0 

1 

14 

699503 

-86359 

-86359 

0 

0.10 

NA 

NA 

314 

GCSH 

79673430 

79687481 

0 

1 

53 

4504 

-5057 

4504 

0 

NA 

0.05 

NA 

315 

KIAA1609 

83068608 

83095794 

1 

1 

55 

143838 

-13315 

-13315 

0 

NA 

0.05 

NA 

316 

FANCA 

88331460 

88410566 

0 

1 

62 

11842 

-246990 

11842 

0 

NA 

0.05 

NA 

317 

ERI1 

8897856 

8928139 

1 

1 

15 

522716 

-109315 

-109315 

0 

0.05 

NA 

NA 

318 

HSDL1 

82713389 

82736265 

0 

1 

54 

101 

-5371 

101 

0 

NA 

0.07 

NA 
319

KIAA0182 

302 

NA 

NA 

NA 

2 

-1 

781 

NA 

NA 

NA 

305 

251 

281 

16 

q24.1 

320 

CBFA2T3 

327 

NA 

NA 

NA 

1.9 

-1 

698 

NA 

NA 

NA 

274 

297 

286 

16 

q24.3 

321 

EGR3 

135 

NA 

NA 

NA 

2 

-1 

751 

NA 

NA 

NA 

308 

267 

289 

8 

p21.3

322 

PCOLCE 

77 

NA 

NA 

NA 

NA 

NA 

NA 

1.8050 

1 

608 

294 

281 

290 

7 

q22.1 

323 

C16orf85

316 

NA 

NA 

NA 

2.1 

-1 

801 

NA 

NA 

NA 

290 

291 

295 

16 

q24.2 

324 

HMBOX1 

159 

1.8 

-1 

553 

NA 

NA 

NA 

NA 

NA 

NA 

287 

306 

300 

8 

p21.1

325 

MTMR9 

100 

1.9 

-1 

674 

NA 

NA 

NA 

NA 

NA 

NA 

343 

257 

301 

8 

p23.1 

326 

MSC 

173 

2 

1 

675 

NA 

NA 

NA 

NA 

NA 

NA 

291 

305 

302 

8 

q13.3

327 

ST3GAL2 

269 

2.4 

-1 

774 

NA 

NA 

NA 

NA 

NA 

NA 

269 

340 

308 

16 

q22.1 

328 

FOXF1 

308 

NA 

NA 

NA 

2.2 

-1 

894 

NA 

NA 

NA 

344 

270 

309 

16 

q24.1 

329 

C8orf58 

132 

NA 

NA 

NA 

3 

-1 

999 

NA 

NA 

NA 

334 

279 

310 

8 

p21.3

330 

KCTD9 

145 

2 

-1 

663 

NA 

NA 

NA 

NA 

NA 

NA 

271 

344 

311 

8 

p21.2

331 

ANGPT1 

214 

2.4 

1 

816 

NA 

NA 

NA 

NA 

NA 

NA 

333 

282 

313 

8 

q23.1 

332 

GDAP1 

180 

2 

1 

663 

NA 

NA 

NA 

NA 

NA 

NA 

283 

333 

314 

8 

q21.11

333 

RNF166 

322 

NA 

NA 

NA 

2.2 

-1 

877 

NA 

NA 

NA 

263 

360 

315 

16 

q24.3 

334 

KLHL1 

253 

NA 

NA 

NA 

NA 

NA 

NA 

1.8637 

-1 

566 

293 

325 

318 

13 

q21.33

335 

LOXL2 

140 

NA 

NA 

NA 

1.9 

-1 

675 

NA 

NA 

NA 

322 

298 

319 

8 

p21.3

336 

WISP1 

233 

2.2 

1 

777 

NA 

NA 

NA 

NA 

NA 

NA 

280 

343 

320 

8 

q24.22 

337 

C8orf80 

157 

3.6 

-1 

957 

NA 

NA 

NA 

NA 

NA 

NA 

357 

274 

323 

8 

p21.1

338 

LAT2 

60 

NA 

NA 

NA 

NA 

NA 

NA 

1.9646

1 

697 

328 

300 

324 

7 

q11.23

339 

USP10 

298 

2.3 

-1 

691 

NA 

NA 

NA 

NA 

NA 

NA 

321 

310 

326 

16 

q24.1 

340 

CDH15 

328 

NA 

NA 

NA 

1.9 

-1 

673 

NA 

NA 

NA 

330 

303 

328 

16 

q24.3 

341 

WFDC1 

294 

2.3 

-1 

713 

NA 

NA 

NA 

NA 

NA 

NA 

311 

327 

329 

16 

q24.1 

342 

C7orf51 

73 

NA 

NA 

NA 

NA 

NA 

NA 

2.1914 

1 

773 

307 

339 

333 

7 

q22.1 

343 

EBF2 

147 

5.1 

-1 

999 

NA 

NA 

NA 

NA 

NA 

NA 

309 

337 

334 

8 

p21.2

344 

CCDC125 

32 

NA 

NA 

NA 

2 

-1 

721 

NA 

NA 

NA 

336 

319 

337 

5 

q13.2 

345 

LGI3

124 

NA 

NA 

NA 

2 

-1 

678 

NA 

NA 

NA 

332 

323 

338 

8 

p21.3

346 

NUDT18 

121 

NA 

NA 

NA 

2.3 

-1 

786 

NA 

NA 

NA 

314 

354 

340 

8 

p21.3

347 

PHYHIP 

126 

NA 

NA 

NA 

2.2 

-1 

860 

NA 

NA 

NA 

361 

308 

341 

8 

p21.3

348 

PILRA 

70 

NA 

NA 

NA 

NA 

NA 

NA 

1.8998 

1 

701 

353 

318 

342 

7 

q22.1 

349 

KAT2A 

340 

NA 

NA 

NA 

NA 

NA 

NA 

3.1978 

1 

993 

318 

357 

343 

17 

q21.2

350 

CSMD3 

216 

4.9 

1 

998 

4.2 

1 

809 

NA 

NA 

NA 

351 

324 

344 

8 

q23.3 

351 

REEP4 

123 

NA 

NA 

NA 

2.5 

-1 

847 

NA 

NA 

NA 

324 

352 

345 

8 

p21.3

352 

TUBB3 

333 

NA 

NA 

NA 

2.6 

-1 

843 

NA 

NA 

NA 

348 

328 

346 

16 

q24.3 

353 

CDT1 

324 

NA 

NA 

NA 

2 

-1 

745 

NA 

NA 

NA 

365 

313 

347 

16 

q24.3 

354 

EDA2R 

365 

2 

1 

629 

NA 

NA 

NA 

NA 

NA 

NA 

349 

331 

348 

23 

q12

355 

DUS1L 

347 

NA 

NA 

NA 

NA 

NA 

NA 

2.2705 

1 

904 

364 

322 

350 

17 

q25.3 

356 

LRCH4 

75 

NA 

NA 

NA 

NA 

NA 

NA 

2.2304 

1 

831 

342 

349 

351 

7 

q22.1 

357 

TMEM75 

223 

3.5 

1 

992 

NA 

NA 

NA 

NA 

NA 

NA 

337 

356 

352 

8 

q24.21 

358 

NUDT7 

277 

2.2 

-1 

730 

NA 

NA 

NA 

NA 

NA 

NA 

355 

338 

353 

16 

q23.1 
319

KIAA0182 

84202524 

84267311

0 

1 

57 

1471 

-517197 

1471 

0 

NA 

0.07 

NA 

320 

CBFA2T3 

87468768 

87570902 

2 

1 

60 

194762 

-13748 

-13748 

0 

NA 

0.05 

NA 

321 

EGR3 

22601119 

22606760 

0 

1 

26 

306299 

-18566 

-18566 

0 

NA 

0.07 

NA 

322 

PCOLCE 

100037818 

100043732

0 

1 

12 

3929 

-1144 

-1144 

0 

NA 

NA 

0.02 

323 

C16orf85 

87147613 

87164049 

0 

1 

58 

294 

-18723 

294 

0 

NA 

0.10 

NA 

324 

HMBOX1 

28803830 

28966706 

0 

1 

32 

14009 

-699246 

14009 

0 

0.02 

NA 

NA 

325 

MTMR9 

11179410 

11223062 

6 

1 

17 

165868 

-154255 

-154255 

0 

0.05 

NA 

NA 

326 

MSC 

72916332 

72919285 

0 

1 

34 

176755 

-479311 

176755 

0 

0.07 

NA 

NA 

327 

ST3GAL2 

68970839 

69030492 

28 

1 

50 

2343793 

-6059 

-6059 

0 

0.21 

NA 

NA 

328 

FOXF1 

85101634 

85105570 

0 

1 

57 

15714 

-587924 

15714 

0 

NA 

0.14 

NA 

329 

C8orf58 

22513067 

22517605 

0 

1 

26 

597 

-1584 

597 

0 

NA 

0.47 

NA 

330 

KCTD9 

25341283 

25371837 

0 

1 

29 

591 

-14747 

591 

0 

0.07 

NA 

NA 

331 

ANGPT1 

108330899 

108579459

0 

1 

42 

401262 

-1444960 

401262 

0 

0.21 

NA 

NA 

332 

GDAP1 

75425173 

75441888 

0 

1 

35 

457439 

-29056 

-29056 

0 

0.07 

NA 

NA 

333 

RNF166 

87290411

87300312 

1 

1 

58 

2604 

-10028 

2604 

0 

NA 

0.14 

NA 

334 

KLHL1 

69172727 

69580592 

0 

1 

49 

1329507

-2470149 

1329507 

0 

NA 

NA 

0.04 

335 

LOXL2 

23210097 

23317667 

0 

1 

28 

-18281 

-34647 

-18281 

0 

NA 

0.05 

NA 

336 

WISP1 

134272494 

134310751 

2 

1 

46 

1248464

-88015 

-88015 

0 

0.14 

NA 

NA 

337 

C8orf80 

27935607 

27997307 

0 

1 

31 

2452 

-29490 

2452 

0 

0.80 

NA 

NA 

338 

LAT2 

73261662 

73282099 

0 

1 

9 

1671 

-12304 

1671 

0 

NA 

NA 

0.06 

339 

USP10 

83291050 

83371026 

0 

1 

56 

40087 

-37634 

-37634 

0 

0.17 

NA 

NA 

340 

CDH15 

87765664 

87789400 

0 

1 

61 

72136 

-194762 

72136 

0 

NA 

0.05 

NA 

341 

WFDC1 

82885822 

82920888 

0 

1 

55 

38746 

-116798 

38746 

0 

0.17 

NA 

NA 

342 

C7orf51 

99919486 

99930358 

0 

1 

12 

44412 

-4648 

-4648 

0 

NA 

NA 

0.13 

343 

EBF2 

25758042 

25958292 

2 

1 

29 

533035 

-336689 

-336689 

0 

1.76 

NA 

NA 

344 

CCDC125 

68612278 

68664392 

0 

1 

4 

31935 

-3274 

-3274 

0 

NA 

0.07 

NA 

345 

LGI3

22060290 

22070290 

1 

1 

24 

8355 

-4897 

-4897 

0 

NA 

0.07 

NA 

346 

NUDT18 

22020328 

22023403 

0 

1 

24 

4474 

-2493 

-2493 

0 

NA 

0.17 

NA 

347 

PHYHIP 

22133162 

22145796 

0 

1 

25 

12768 

-7380 

-7380 

0 

NA 

0.14 

NA 

348 

PILRA 

99809004 

99835650 

1 

1 

11 

29540 

-5616 

-5616 

0 

NA 

NA 

0.05 

349 

KAT2A 

37518657 

37526872 

0 

1 

64 

1489 

-380 

-380 

0 

NA 

NA 

0.57 

350 

CSMD3 

113304337 

114518418 

0 

1 

43 

1971482 

-4139285 

1971482 

0 

1.63 

1.16 

NA 

351 

REEP4 

22051478 

22055393 

0 

1 

24 

4897 

-6152 

4897 

0 

NA 

0.25 

NA 

352 

TUBB3 

88513168 

88530006 

1 

1 

62 

12678 

-7881 

-7881 

0 

NA 

0.29 

NA 

353 

CDT1 

87397687 

87403166 

1 

1 

59 

4478 

-67370 

4478 

0 

NA 

0.07 

NA 

354 

EDA2R 

65732204 

65775608 

0 

1 

69 

904991 

-328248 

-328248 

0 

0.07 

NA 

NA 

355 

DUS1L 

77609043 

77629242 

0 

1 

66 

262 

-40474 

262 

0 

NA 

NA 

0.16 

356 

LRCH4 

100009570 

100021712 

0 

1 

12 

180 

-5792 

180 

0 

NA 

NA 

0.14 

357 

TMEM75 

129029046 

129029462 

2 

1 

44 

2104073 

-206193 

-206193 

0 

0.74 

NA 

NA 

358 

NUDT7 

76313912 

76333652 

0 

1 

52 

280292 

-287400 

280292 

0 

0.14 

NA 

NA 

Table  6  (10a) 


359 

TSGA14 

87 

NA 

NA 

NA 

NA 

NA 

NA 

9.3754 

1 

966 

354 

341 

354 

7 

q32.2 

360 

CDC42BPG 

242 

NA 

NA 

NA 

NA 

NA 

NA 

2.3279 

1 

813 

360 

336 

355 

11 

q13.1

361 

TSC22D4 

72 

NA 

NA 

NA 

NA 

NA 

NA 

2.1304 

1 

867 

341 

359 

356 

7 

q22.1 

362 

NOTUM 

345 

NA 

NA 

NA 

NA 

NA 

NA 

2.6756 

1 

963 

358 

348 

358 

17 

q25.3 

363 

HSPB9 

341 

NA 

NA 

NA 

NA 

NA 

NA 

2.9366 

1 

987 

346 

361 

360 

17 

q21.2

364 

TFR2 

79 

NA 

NA 

NA 

NA 

NA 

NA 

2.6230 

1 

950 

352 

355 

361 

7 

q22.1 

365 

SLA 

232 

2.2 

1 

786 

NA 

NA 

NA 

NA 

NA 

NA 

347 

365 

362 

8 

q24.22 

366 

WWOX 

279 

9.3 

-1 

1000 

NA 

NA 

NA 

NA 

NA 

NA 

359 

364 

365 

16 

q23.1 

367 

POU5F1B 

221 

2.9 

1 

989 

NA 

NA 

NA 

NA 

NA 

NA 

366 

358 

366 

8 

q24.21 

368 

OPHN1 

367 

5.8 

1 

999 

NA 

NA 

NA 

NA 

NA 

NA 

368 

368 

368 

23 

q12 

Table  6  (10b) 


359 

TSGA14 

129823611 

129868133 

0 

1 

13 

45149 

-8446 

-8446 

0 

NA 

NA 

4.49 

360 

CDC42BPG 

64348240 

64368617 

2 

1 

47 

80139 

-12898 

-12898 

0 

NA 

NA 

0.18 

361 

TSC22D4 

99902080 

99914838 

0 

1 

12 

4648 

-32404 

4648 

0 

NA 

NA 

0.11 

362 

NOTUM 

77503689 

77512353 

0 

1 

65 

16362 

-39903967 

16362 

0 

NA 

NA 

0.32 

363 

HSPB9 

37528361 

37528897 

0 

1 

64 

1627 

-1489 

-1489 

0 

NA 

NA 

0.44 

364 

TFR2 

100055975 

100077109 

0 

1 

12 

1569 

-5043 

1569 

0 

NA 

NA 

0.29 

365 

SLA 

134118155 

134184479 

0 

1 

46 

88015 

98170 

88015 

0 

0.14 

NA 

NA 

366 

WWOX 

76691052 

77803532 

2 

1 

52 

1391644 

-67557 

-67557 

0 

4.45 

NA 

NA 

367 

POU5F1B 

128497039 

128498621 

0 

1 

44 

318241 

-2323848 

318241 

0 

0.42 

NA 

NA 

368 

OPHN1 

67179440 

67570372 

NA 

1 

69 

NA 

-318596 

-318596 

0 

2.24 

NA 

NA 

